Hi, I am trying to scrape data from the PGA website to build a CSV of information on golf courses. I tried something new and used the re and pandas modules instead of Beautiful Soup to access the data. I am having a problem writing the CSV file. I tried using a pandas DataFrame, but I kept getting an attribute error. With my current scheme, I get an attribute error when encoding to UTF-8, and I was wondering if I should just break my scrapers into try/except blocks. Lastly, how can I create a progress bar for my sanity while waiting for the scrape to finish? Any ideas or thoughts will be greatly appreciated.
Code cited below:
import re
import requests
import pandas as pd  # not used below; left over from the DataFrame attempt
import csv

L = []
for i in range(1):  # range(n) visits pages 0..n-1, so pass the number of pages
    # This shortened URL has no "{}" placeholder, so .format(i) returns it unchanged
    url = "http://ift.tt/1TSyPTR".format(i)
    r = requests.get(url)
    name = re.findall('(?<=<div class="views-field-title"><span class="field-content">)([^<]+)', r.text)
    print(name)
    address1 = re.findall('(?<=<div class="views-field-address"><span class="field-content">)([^<]+)', r.text)
    address2 = re.findall('(?<=<div class="views-field-city-state-zip"><span class="field-content">)([^<]+)', r.text)
    ownership = re.findall('(?<=<div class="views-field-course-type"><span class="field-content">)([^<]+)', r.text)
    website = re.findall('(?<=<div class="views-field-website"><span class="field-content">)([^<]+)', r.text)
    phone = re.findall('(?<=<div class="views-field-work-phone"><span class="field-content">)([^<]+)', r.text)
    # zip() builds one tuple of strings per course. Appending the six lists
    # themselves makes every CSV row a list of lists, and a list has no .encode().
    # zip() stops at the shortest list, so a field missing on some courses drops rows.
    L.extend(zip(name, address1, address2, ownership, website, phone))

# Python 3: open() handles the UTF-8 encoding; newline='' avoids blank lines on Windows
with open('Testing.csv', 'a', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerows(L)
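For the try/except and progress bar questions, here is a minimal sketch of one possible approach, not a definitive answer. It assumes Python 3 and uses only the standard library. The names `scrape_pages`, `fetch`, and `parse` are hypothetical: `fetch(url)` is assumed to return the page HTML as a string, and `parse(html)` is assumed to return a list of rows. A failing page is recorded and skipped instead of stopping the whole run, and a text bar is redrawn on one line after each page.

```python
import sys


def scrape_pages(page_urls, fetch, parse, out=sys.stderr):
    """Fetch and parse each page, skipping pages that raise an error.

    fetch(url) -> str and parse(html) -> list of rows are supplied by the
    caller (hypothetical helpers, for example thin wrappers around
    requests.get and re.findall).
    """
    rows, failed = [], []
    total = len(page_urls)
    for done, url in enumerate(page_urls, start=1):
        try:
            rows.extend(parse(fetch(url)))
        except Exception as exc:
            # Broad on purpose: one bad page should not abort the whole scrape.
            failed.append((url, repr(exc)))
        # Redraw a 30-character bar on the same line via a carriage return.
        filled = 30 * done // total
        out.write("\r[{}{}] {}/{}".format("#" * filled, "-" * (30 - filled), done, total))
        out.flush()
    if total:
        out.write("\n")
    return rows, failed
```

With requests this might be called as `rows, failed = scrape_pages(urls, lambda u: requests.get(u, timeout=10).text, parse_courses)`, where `urls` and `parse_courses` are placeholders for the page list and the regex extraction. If installing a third-party package is acceptable, wrapping the loop iterable in `tqdm(page_urls)` from the tqdm library is a common alternative to the hand-drawn bar.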