
Can't Fetch Some Content From A Webpage Using Post Requests

I've written a Python script that uses Selenium to scrape some content from a box-like container in the left sidebar of a webpage. When I use Selenium I c

Solution 1:

Ok, I've done a bit of reverse engineering. It seems like the whole process runs on the client side. Here's how:

wave.engine.statistics contains the result you're looking for:

// wave.min.js

wave.fn.applyRules = function() {
    var e = {};
    e.statistics = {};
    try {
        e.categories = wave.engine.run(),
        e.statistics = wave.engine.statistics;
        wave.engine.ruleTimes;
        e.statistics.pagetitle = wave.page.title,
        e.statistics.totalelements = wave.allTags.length,
        e.success = !0
    } catch (t) {
        console.log(t)
    }
    return e
}

Here, the wave.engine.run function runs all the rules on the client side (s is the <body> element) and returns the results:

wave.engine.run = function(e) {
    var t = new Date
      , n = null
      , i = null
      , a = new Date;
    wave.engine.fn.calculateContrast(this.fn.getBody());
    var o = new Date
      , r = wave.rules
      , s = $(wave.page);
    if (e)
        r[e] && r[e](s);
    else
        for (e in r) {
            n = new Date;
            try {
                r[e](s)
            } catch (l) {
                console.log("RULE FAILURE(" + e + "): " + l.stack)
            }
            i = new Date,
            this.ruleTimes[e] = i - n,
            config.debug && console.log("RULE: " + e + " (" + this.ruleTimes[e] + "ms)")
        }
    return EndTimer = new Date,
    config.debug && console.log("TOTAL RULE TIME: " + (EndTimer - t) + "ms"),
    a = new Date,
    wave.engine.fn.structureOutput(),
    o = new Date,
    wave.engine.results
}
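
To make the control flow of that minified function easier to follow, here is a rough Python sketch of the same loop: run one named rule if asked, otherwise run every rule, time each one, and keep going when a rule throws. The names run_rules, rules, page and only are made up for illustration; they are not part of WAVE, and the contrast calculation and output structuring steps are left out.

```python
import time


def run_rules(rules, page, only=None):
    """Run one named rule, or all rules, against `page`.

    `rules` is a dict mapping a rule name to a callable taking `page`.
    Returns (rule_times_ms, failures). Hypothetical helper, not WAVE's API.
    """
    rule_times = {}
    failures = {}

    if only is not None:
        # Mirrors `r[e] && r[e](s)`: run a single rule if it exists, untimed.
        if only in rules:
            rules[only](page)
        return rule_times, failures

    for name, rule in rules.items():
        start = time.perf_counter()
        try:
            rule(page)
        except Exception as exc:
            # The JS logs "RULE FAILURE" and moves on; here we record the error.
            failures[name] = exc
        rule_times[name] = (time.perf_counter() - start) * 1000.0

    return rule_times, failures
```

A failing rule does not stop the others, which matches the try/catch inside the JavaScript loop.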

So you have two options: port these rules into Python, or keep using Selenium.

wave.rules = {},
wave.rules.text_justified = function(e) {
    e.find("p, div, td").each(function(t, n) {
        var i = e.find(n);
        "justify" == i.css("text-align") && wave.engine.fn.addIcon(n, "text_justified")
    })
}
,
wave.rules.alt_missing = function(e) {
    wave.engine.fn.overrideby("alt_missing", ["alt_link_missing", "alt_map_missing", "alt_spacer_missing"]),
    e.find("img:not([alt])").each(function(e, t) {
        var n = $(t);
        void 0 != n.attr("title") && 0 != n.attr("title").length || wave.engine.fn.addIcon(t, "alt_missing")
    })
}
// ... and many more
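
For a sense of what porting a rule would involve, here is a minimal sketch of alt_missing in Python using only the standard library's html.parser. The class and function names (AltMissingFinder, find_alt_missing) are my own, and the overrideby bookkeeping from the original is omitted. Note that this only works because alt_missing looks at attributes alone; a rule like text_justified needs computed CSS, which a plain HTML parser cannot give you.

```python
from html.parser import HTMLParser


class AltMissingFinder(HTMLParser):
    """Collect <img> tags with no alt attribute and no non-empty title."""

    def __init__(self):
        super().__init__()
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Equivalent of the selector img:not([alt]).
        if "alt" in attr_map:
            return
        # The JS skips images whose title is defined and non-empty.
        if attr_map.get("title"):
            return
        self.flagged.append(self.get_starttag_text())


def find_alt_missing(html):
    """Return the raw text of every <img> tag the rule would flag."""
    finder = AltMissingFinder()
    finder.feed(html)
    finder.close()
    return finder.flagged
```

Usage would look like find_alt_missing(page_source), where page_source is the HTML string of the page. It sees only the markup you feed it, so anything the page adds with JavaScript after load is missed unless the HTML comes from a real browser.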

Since the tests rely on the browser engine to render the page fully (unfortunately, reports are not generated in the cloud), you have to use Selenium for this job.
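
If you stay with Selenium, one way to get at the numbers is to read wave.engine.statistics straight from the page instead of scraping the rendered sidebar. The sketch below is untested against the live site and rests on two assumptions: that wave is reachable as a global in the document Selenium is looking at (it may live in a frame you have to switch to first), and that the presence of totalelements means applyRules has finished, since the snippet above sets it after wave.engine.run() returns. The names read_wave_statistics, main and report_url are placeholders.

```python
import time

# Returns the statistics object, or null while `wave` is not available yet.
STATS_JS = "return (window.wave && wave.engine && wave.engine.statistics) || null;"


def read_wave_statistics(driver, timeout=30.0, poll=0.5):
    """Poll the page until wave.engine.statistics looks complete, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        stats = driver.execute_script(STATS_JS)
        # Assumption: `totalelements` is written last, so it marks completion.
        if stats and "totalelements" in stats:
            return stats
        if time.monotonic() >= deadline:
            raise TimeoutError("wave.engine.statistics was not populated in time")
        time.sleep(poll)


def main(report_url):
    """Open `report_url` (placeholder for the report page) and fetch the statistics."""
    from selenium import webdriver  # third-party: pip install selenium

    driver = webdriver.Chrome()
    try:
        driver.get(report_url)
        return read_wave_statistics(driver)
    finally:
        driver.quit()
```

Selenium converts the JavaScript object into a Python dict, so the result can be used directly, for example stats["pagetitle"].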
