Puppeteer Puppeteer is much


Post by poxoja9630

        return parensRegex.test(link.children[0].data);
    };

    (async () => {
      const response = await got(vgmUrl);
      const $ = cheerio.load(response.body);

      $('a').filter(isMidi).filter(noParens).each((i, link) => {
        const href = link.attribs.href;
        console.log(href);
      });
    })();

Note that filtering with functions is built into Cheerio's API, so we don't need any additional code to convert the collection of items into an array. Replace the code in index.js with this new code and run it again. It should run noticeably faster because Cheerio is a lighter library. If you want more detailed information, check out the other tutorial I wrote on using Cheerio.
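To illustrate the point about not needing to convert a collection into an array, here is a minimal sketch using only plain JavaScript. The `arrayLike` object and its `href` values are hypothetical stand-ins for an array-like collection of links (something with indexed items and a `length`, but no `filter` method); they are not taken from the tutorial.

```javascript
// Hypothetical array-like collection: indexed items plus a length,
// but none of the Array methods such as filter or map.
const arrayLike = {
  0: { href: 'a.mid' },
  1: { href: 'b.html' },
  length: 2,
};

// arrayLike.filter is undefined here, so the collection has to be
// converted to a real array before it can be filtered.
const asArray = Array.from(arrayLike);

const midiHrefs = asArray
  .filter(item => item.href.includes('.mid'))
  .map(item => item.href);

console.log(midiHrefs);
```

With Cheerio, the `Array.from` step is unnecessary because the object returned by `$('a')` already exposes its own `filter` method.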

Puppeteer

Puppeteer differs considerably from the previous two tools in that it is primarily a library for headless browser scripting (writing scripts for browsers without a visual user interface). Puppeteer provides a high-level API to control Chrome or Chromium via the DevTools Protocol. It is more versatile because you can write code to interact with and manipulate web applications rather than just reading static data. Install it with the following command:

    npm install [email protected]

Web scraping with Puppeteer differs from the previous two tools in one key way: rather than writing code to retrieve the raw HTML of a URL and pass it to an object, you write code that runs in the context of a browser, which processes the HTML of the given URL and builds a real Document Object Model (DOM) from it.

The following code snippet instructs Puppeteer's browser to go to the URL we want and access all the hyperlink elements we analyzed earlier:

    const puppeteer = require('puppeteer');

    const vgmUrl =
    /console/nintendo/nes';

    (async () => {
      // Start a headless browser and open a new tab.
      const browser = await puppeteer.launch();
      const page = await browser.newPage();

      await page.goto(vgmUrl);

      // This callback runs inside the page, against the real DOM.
      const links = await page.$$eval('a', elements => elements.filter(element => {
        const parensRegex = /^((?!\().)*$/;
        return element.href.includes('.mid') && parensRegex.test(element.textContent);
      }).map(element => element.href));

      links.forEach(link => console.log(link));

      await browser.close();
    })();

Note: We still write logic to filter links on the page, but instead of declaring more filter functions, we just do it inline.
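The inline filter can be tried without launching a browser. The sketch below applies the same two conditions (the `href` contains `.mid`, and the link text contains no opening parenthesis) to plain objects. The `sampleLinks` data, the `example.com` URLs, and the `filterMidiLinks` helper name are hypothetical and exist only for illustration.

```javascript
// Hypothetical sample data standing in for the <a> elements on the page.
const sampleLinks = [
  { href: 'https://example.com/music/Zelda.mid', textContent: 'Zelda Overworld' },
  { href: 'https://example.com/music/Zelda2.mid', textContent: 'Zelda Overworld (2)' },
  { href: 'https://example.com/index.html', textContent: 'Home' },
];

// Same logic as the callback passed to page.$$eval in the post,
// pulled out into a hypothetical helper so it can run in plain Node.js.
function filterMidiLinks(elements) {
  // Matches only strings that contain no "(" character.
  const parensRegex = /^((?!\().)*$/;
  return elements
    .filter(element => element.href.includes('.mid') && parensRegex.test(element.textContent))
    .map(element => element.href);
}

console.log(filterMidiLinks(sampleLinks));
```

Only the first entry passes both checks: the second has a parenthesis in its text, and the third does not point to a `.mid` file.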