String formatting: One particularly useful string method is format. It constructs a string by inserting values into a template string. Consider this example, which generates log messages for a hypothetical web server.
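A minimal sketch of such a log-message template. The template text and the field names (ip, method, path, status) are illustrative assumptions, not taken from any real server:

```python
# Hypothetical log template; the field names are illustrative assumptions.
LOG_TEMPLATE = "{ip} - {method} {path} -> {status}"


def log_message(ip, method, path, status):
    """Fill the template with the values of one request."""
    return LOG_TEMPLATE.format(ip=ip, method=method, path=path, status=status)


print(log_message("127.0.0.1", "GET", "/index.html", 200))
# prints: 127.0.0.1 - GET /index.html -> 200
```

Named fields keep the template readable and let the arguments be passed in any order.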
Basically, I am accessing weworkremotely.com to retrieve the first span element with class “title” and the first span element with class “company”.
But since many jobs are posted on the website, I would like to retrieve all the posts and companies that share the classes “title” and “company”.
So instead of read_soup.find, which returns only the first match, I should use read_soup.find_all, which returns every match, and a for loop to collect all the items in a list.
# Python 2 code: urllib2 and the print statement do not exist in Python 3
import urllib2
from bs4 import BeautifulSoup as soup

quote_page = 'https://weworkremotely.com/'
page = urllib2.urlopen(quote_page)
read_soup = soup(page, "html.parser")

jobs = []
companys = []
# find_all returns every matching tag, not just the first one
name_box_job = read_soup.find_all('span', attrs={'class': 'title'})
name_box_company = read_soup.find_all('span', attrs={'class': 'company'})
for n in range(len(name_box_job)):
    jobs.append(name_box_job[n].get_text())
for m in range(len(name_box_company)):
    # company names go into companys, not jobs
    companys.append(name_box_company[m].get_text())
print jobs, companys
However, the output of this code is very ugly (two raw lists printed side by side), so the formatting still needs work.
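One possible way to tidy the output is to pair each job with its company and print one aligned line per posting using format. This is only a sketch: the helper name format_listings and the sample data are hypothetical, and it assumes the two lists line up one-to-one (zip silently stops at the shorter list).

```python
def format_listings(jobs, companys):
    """Return one aligned 'job | company' line per posting."""
    pairs = list(zip(jobs, companys))
    if not pairs:
        return ""
    # Pad every job title to the width of the longest one.
    width = max(len(job) for job, _ in pairs)
    return "\n".join(
        "{:<{w}} | {}".format(job, company, w=width) for job, company in pairs
    )


# Hypothetical sample data standing in for the scraped lists.
sample_jobs = ["Backend Developer", "Designer"]
sample_companys = ["Acme", "Globex"]
print(format_listings(sample_jobs, sample_companys))
```

The nested field {w} sets the column width at run time, so the separator stays aligned however long the titles are.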
This is a very interesting topic and I will continue to expand on the current result, so I am storing the resources here and will come back later to try them out:
When I tried to run the wordcloud Python library today, I received the error: “The _imagingft C module is not installed”. The reason was that freetype was not installed on my Mac. I tried a lot of methods, and finally the one below worked:
I already have Homebrew installed.
First,
brew install freetype
After that, the following files are present in /usr/local/lib: libfreetype.6.dylib, libfreetype.a, libfreetype.dylib
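A quick way to confirm that the files listed above are in place is to check for them from Python. This is a sketch only: the helper name missing_freetype_libs is hypothetical, and the default directory assumes the /usr/local/lib location mentioned above.

```python
import os

# File names taken from the listing above.
EXPECTED_LIBS = ["libfreetype.6.dylib", "libfreetype.a", "libfreetype.dylib"]


def missing_freetype_libs(lib_dir="/usr/local/lib"):
    """Return the expected freetype files that are not present in lib_dir."""
    return [
        name for name in EXPECTED_LIBS
        if not os.path.exists(os.path.join(lib_dir, name))
    ]


# An empty list means all three files were found.
print(missing_freetype_libs())
```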