import pandas as pd
import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.utils.log import configure_logging
class FreshThymeBaconSpider(scrapy.Spider):
= 'Fresh Thyme Market Bacon Spider'
name
def start_requests( self ):
= ['https://ww2.freshthyme.com/sm/planning/rsid/951/results?q=Bacon&take=48&f=Category%3AHot+Dogs%2C+Bacon+%26+Sausage']
start_urls for url in start_urls:
yield scrapy.Request( url = url, callback = self.cardsParse)
def cardsParse(self, response):
#Fail safe for links
try:
#grabs all cards from list and saves the link to follow
= '//*[contains(@class,"Listing")]/div/a/@href'
xpath = response.xpath(xpath)
listCards
linklist.append(listCards.extract())for url in listCards:
yield response.follow( url = url, callback = self.itemParse, meta={'link': url} )
except AttributeError:
pass
def itemParse(self, response):
#xpaths to the name and price
= '//*[contains(@class, "PdpInfoTitle")]/text()'
nameXpath = '//*[contains(@class, "PdpMainPrice")]/text()'
priceXpath = response.meta.get('link')
url #Grabs the name and price from the xpaths and adds them to the bacon list
'bacon': response.xpath(nameXpath).extract(), 'price': response.xpath(priceXpath).extract()})
bacon.append({
# Start
configure_logging()= []
bacon = []
linklist = CrawlerProcess()
process
process.crawl(FreshThymeBaconSpider)
process.start()
process.stop()= pd.DataFrame(bacon)
baconFrame print(baconFrame)
Current Project Objectives
The currents project objectives for this week was to
Catch up on any additional training.
Collect and find data on heirloom tomatoes, eggs, and bacon.
Learning how to do web scraping in python through Datacamp.
Building programs to do web scraping
Works in Progress
An excel sheet that contains a list of small businesses of Iowa grocers that we had to go through and find places that had the data we wanted.
These are some examples of what we were looking for
Along with some manual data scraping. We started work on some data scraping programs (spiders). This code block is a spider that I recently created. However, there are still some improvements that still need to be made.
This image shows an example output of data we were able to scrape.
DSPG Questions
Are there stores or market places that would be helpful for us to look into?
Is anyone experienced in web scraping and if so there any advice that you have for us?