How to iterate over the content of an HTML page to drive some other API calls?

I am trying to understand how best to parse HTML content in a Postman test script.

So far, I am able to print the content to the console, but I have not found an elegant way to iterate through two types of elements.

My HTML content is a `<table>` with a bunch of lines. Each line (tr) contains multiple elements (td), and I would like to extract the name in the link (the text inside `<a>`) if the line is Active. For this first line, that means getting the value 28683A1.

    <tbody>
                                <tr id="row_1" class="tr-content">
                                                            <td class="tr-content-td ">
                                    &nbsp;<a href="index.phtml?page=company&amp;cny=29383900">28683A1</a>&nbsp;
                                                </td>
                                            <td title="28683A1" class="tr-content-td ">
                28683A1                                </td>
                                            <td title="production" class="tr-content-td ">
                production                                </td>
                                            <td title="0" class="tr-content-td tr-content-td-nr">
                0                                </td>
                                            <td title="MAIN_OWNER_02" class="tr-content-td ">
                MAIN_OWNER_02                                </td>
                                            <td title="Active" class="tr-content-td ">
                Active                                </td>

I’ve already tried:

// I need to filter on those values with Active as status value

// path: tbody/tr/td/a/value

const $ = cheerio.load(pm.response.text())

const tbody = $(".tr-content-td a").text();

console.log(tbody); // this dumps all the text, but I have not found a way to make it conditional on the other field.

The logic I am trying to build is to get a list of `<tr>` rows, then for each row, if it has a `<td>` with the value Active, extract the text of the first `<a>` element inside that `<tr>`.

But I am not sure where to find this in the Postman documentation or elsewhere.

First of all, you need to load your response into a variable.
Then manipulate the result to get the value.

const $ = cheerio.load(pm.response.text());
var my_number = $('tr[class="tr-content"]');

Got this code to work eventually.

const $ = cheerio.load(pm.response.text());

// Result holders (assumed to be declared earlier in the original script).
const companyArray = [];
let dataFound = false;

// I need to filter on those values with `Active` as status value
// get the list of table rows
const trList = $('tbody tr');
_.forEach(trList, function (line) {
    // `children` also contains the whitespace text nodes between cells,
    // so the <td> elements sit at odd indices: [1] is the first cell, [11] the sixth (status).
    if ($(line.children[11]).attr('title') === "Active") {
        // for each Active row, extract the text of the link in the first cell
        const value = $(line.children[1]).find('a').text();
        // exclude some patterns from the list
        if (!value.startsWith('XMLGatewayTest')) {
            companyArray.push(value);
            dataFound = true;
        }
    }
});
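
For anyone landing here later, the same filtering can probably be done with selectors instead of child indexes, which avoids having to count whitespace nodes. This is only a rough sketch, not tested against the real page: it assumes the status cell is the only `<td>` in a row with `title="Active"` and that the name is the link in the first cell. The function name `extractActiveNames`, its `excludedPrefix` parameter, and the `companyArray` collection variable are made up for illustration.

```javascript
// Sketch: collect link texts from rows whose status cell is "Active".
// `$` is a cheerio instance already loaded with the page HTML.
// `excludedPrefix` is a hypothetical parameter added for illustration.
function extractActiveNames($, excludedPrefix) {
    const names = [];
    $('tbody tr').each(function (index, row) {
        // Find the status cell by its title attribute rather than by child index,
        // so whitespace text nodes between the cells do not matter.
        const isActive = $(row).find('td[title="Active"]').length > 0;
        if (!isActive) {
            return; // skip this row and continue with the next one
        }
        // The name is the text of the link in the first cell of the row.
        const name = $(row).find('td').first().find('a').text().trim();
        if (name && !(excludedPrefix && name.startsWith(excludedPrefix))) {
            names.push(name);
        }
    });
    return names;
}

// Inside a Postman test script `pm` is defined; elsewhere this part is skipped.
if (typeof pm !== 'undefined') {
    const cheerio = require('cheerio');
    const $ = cheerio.load(pm.response.text());
    const companyArray = extractActiveNames($, 'XMLGatewayTest');
    // Assumption: later requests read the list back from a collection variable.
    pm.collectionVariables.set('companyArray', JSON.stringify(companyArray));
}
```

Storing the list as a JSON string in a variable is just one way to hand the names over to the following requests; a later script can read it back with `JSON.parse(pm.collectionVariables.get('companyArray'))`.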

It was a lot of trial and error, and I was missing some tutorial to help me.

Hey @erajkovic, welcome to the community :tada:

I see you have been doing amazing work with the cheerio library, and that’s a perfect way to get the work done (especially with HTML content). I think you have already got the solution, but to make it smoother, here are some links:

  1. Web scraping with cheerio.js.
  2. Parsing HTML response.
  3. Web Scrapper or Parser using postman.
  4. Parsing HTML form response data and form attribute.

Cheers :beers: