
How To Collect All Of The Anchor Hrefs Using Scrapy?

I am trying to find these anchor hrefs in the Scrapy shell:

$ scrapy shell https://www.trendyol.com/trendyol-man/antrasit-basic-erkek-bisiklet-yaka-oversize-kisa-koll

Solution 1:

As mentioned by Fazlul, the data is generated dynamically (more specifically, only the images and reviews). Using Chrome DevTools, you can easily find this API endpoint: https://public.trendyol.com/discovery-web-productgw-service/api/productGroup/68379869. Request it directly and parse the JSON response, and you are good to go.

Code

import json

import scrapy
from scrapy import Request


class Trendyol(scrapy.Spider):
    name = 'test'
    domain_name = "https://www.trendyol.com"

    def start_requests(self):
        # Request the JSON API directly instead of the rendered product page
        url = "https://public.trendyol.com/discovery-web-productgw-service/api/productGroup/68379869"
        yield Request(url=url, callback=self.parse)

    def parse(self, response):
        json_text = json.loads(response.body)
        data = json_text.get('result').get("slicingAttributes")[0].get("attributes")
        for i in data:
            # The first content entry of each attribute holds a relative product URL
            full_url = self.domain_name + i['contents'][0]['url']
            print(full_url)
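
The chained lookups in parse raise an exception as soon as one key is missing or a list is empty. A minimal sketch of a more tolerant version of the same traversal follows. The helper name collect_urls and the sample payload are hypothetical and only mimic the key layout used above; urljoin is swapped in for the plain string concatenation.

```python
from urllib.parse import urljoin


def collect_urls(payload, base_url, field="url"):
    """Build absolute URLs from one field of each attribute's first content entry.

    Missing keys and empty lists are skipped instead of raising.
    """
    result = payload.get("result") or {}
    slicing = result.get("slicingAttributes") or []
    if not slicing:
        return []
    urls = []
    for attribute in slicing[0].get("attributes") or []:
        contents = attribute.get("contents") or []
        if contents and contents[0].get(field):
            urls.append(urljoin(base_url, contents[0][field]))
    return urls


# Hypothetical payload: same key layout as above, made-up values.
sample = {
    "result": {
        "slicingAttributes": [
            {
                "attributes": [
                    {"contents": [{"url": "/brand/item-p-1", "imageUrl": "/ty1/a.jpg"}]},
                    {"contents": []},  # no content entry, so it is skipped
                    {"contents": [{"url": "/brand/item-p-2", "imageUrl": "/ty2/b.jpg"}]},
                ]
            }
        ]
    }
}

print(collect_urls(sample, "https://www.trendyol.com"))
print(collect_urls(sample, "https://cdn.dsmcdn.com", field="imageUrl"))
print(collect_urls({}, "https://www.trendyol.com"))
```

Inside the spider, this could be called as collect_urls(json.loads(response.body), self.domain_name). Passing field="imageUrl" with the CDN base should give image links instead of product links.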

Solution 2:

You can get the image URLs this way in the Scrapy shell. The site uses an API to load the data.

$ scrapy shell https://public.trendyol.com/discovery-web-productgw-service/api/productGroup/68379869
>>> import json
>>> raw_images = json.loads(response.text)
>>> raw_images = raw_images['result']["slicingAttributes"][0]["attributes"]
>>> ["https://cdn.dsmcdn.com" + image['contents'][0]['imageUrl'] for image in raw_images]

output:

['https://cdn.dsmcdn.com/ty62/product/media/images/20210128/20/58099823/135399582/5/5_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty129/product/media/images/20210616/9/101392400/186966992/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty63/product/media/images/20210128/20/58099823/135399574/4/4_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty106/product/media/images/20210426/18/83152826/164609399/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty62/product/media/images/20210128/20/58099823/135399570/4/4_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty63/product/media/images/20210128/20/58099823/135399586/5/5_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty106/product/media/images/20210426/18/83152826/164609404/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty76/product/media/images/20210323/17/74722131/151899173/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty62/product/media/images/20210129/19/58452645/135399594/3/3_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty62/product/media/images/20210128/20/58099823/135399598/4/4_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty48/product/media/images/20210329/20/76027592/151899177/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty107/product/media/images/20210426/18/83152826/164609413/2/2_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty69/product/media/images/20210323/17/74722131/151899169/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty64/product/media/images/20210128/20/58099823/135399578/4/4_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty62/product/media/images/20210128/20/58099823/135399590/4/4_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty105/product/media/images/20210426/18/83152826/164609408/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty85/product/media/images/20210312/17/70978132/149257621/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty131/product/media/images/20210616/9/101407912/186952893/1/1_org_zoom.jpeg',
 'https://cdn.dsmcdn.com/ty135/product/media/images/20210628/8/104826549/186953020/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty64/product/media/images/20210128/20/58099823/135399562/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty68/product/media/images/20210214/13/62120233/138664105/1/1_org_zoom.jpg',
 'https://cdn.dsmcdn.com/ty105/product/media/images/20210421/10/81841504/135399565/1/1_org_zoom.jpg']
