How to Analyze Stock Data using Python | Learn Python
Week 1. Basic Python, Business Workflow Automation
We are going to use Google Colab throughout this lesson. Colab lets you write and run code in your browser without installing Python, and your notebooks are saved to your Google Drive.
Python syntax is quite straightforward. For example, print('hello world') prints hello world. Python is used to analyze data, build server programs, and automate business tasks.
Now, click the link above and create a new notebook (file). Then, rename it to Colab - Week 1.
# Ctrl+Enter is the keyboard shortcut to run the code.
print('hello world') # hello world
01. Basic Python (1)
a) Variable & Basic Operators
a = 3 # assign 3 to a
b = a # assign the value of a (3) to b
a = a + 1 # add 1 to a and store the result back in a (now 4)
num1 = a*b # assign the value of a*b to the variable num1
num2 = 99 # assign 99 to the variable num2
# You can name a variable as you like, but the name should be recognizable and not too complicated
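To check what each variable holds after the assignments above, you can print it. The following is a minimal sketch that repeats the same assignments and also tries a few other basic operators. The extra print lines are additions for illustration only.

```python
a = 3          # a holds 3
b = a          # b gets the current value of a, which is 3
a = a + 1      # a becomes 4; b is unchanged
num1 = a * b   # 4 * 3
num2 = 99

print(a, b)         # 4 3
print(num1)         # 12
print(num1 + num2)  # 111
print(num2 / 2)     # 49.5 (division always gives a float)
print(num2 // 2)    # 49   (floor division)
print(num2 % 2)     # 1    (remainder)
```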
b) List, Dictionary
- List : order matters!
a_list = ['Apple','Pear','Orange','Watermelon']
a_list[0]
a_list.append('Strawberry')
a_list[4]
- Dictionary : { key : value } -> format matters!
a_dict = {'name':'bob','age':21}
a_dict['age']
a_dict['height'] = 178
a_dict
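In Colab, only the last expression in a cell is displayed, so it can help to print each step. Here is a small sketch of the list and dictionary operations above with the values they produce. The len, negative index, and get lines are extra examples that are not part of the lesson code.

```python
a_list = ['Apple', 'Pear', 'Orange', 'Watermelon']
print(a_list[0])        # Apple (indexes start at 0)
a_list.append('Strawberry')
print(a_list[4])        # Strawberry
print(len(a_list))      # 5
print(a_list[-1])       # Strawberry (negative indexes count from the end)

a_dict = {'name': 'bob', 'age': 21}
print(a_dict['age'])    # 21
a_dict['height'] = 178  # adds a new key to the dictionary
print(a_dict)           # {'name': 'bob', 'age': 21, 'height': 178}
print(a_dict.get('weight', 'unknown'))  # unknown (no error for a missing key)
```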
- Combination of Dictionary and List
people = [{'name':'bob','age':20},{'name':'carry','age':38}]
# people[0]['name']? 'bob'
# people[1]['name']? 'carry'
person = {'name':'john','age':7}
people.append(person)
# people? [{'name':'bob','age':20},{'name':'carry','age':38},{'name':'john','age':7}]
# people[2]['name']? 'john'
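Values inside a list of dictionaries can be changed the same way they are read: first pick the item by index, then pick the value by key. A short sketch, where the 'hobbies' field is a made-up example that does not appear in the lesson:

```python
people = [{'name': 'bob', 'age': 20}, {'name': 'carry', 'age': 38}]
people.append({'name': 'john', 'age': 7})

print(len(people))          # 3
print(people[2]['name'])    # john

people[0]['age'] = 21       # update a value inside the first dictionary
print(people[0])            # {'name': 'bob', 'age': 21}

# Hypothetical extra field: a dictionary value can itself be a list
people[1]['hobbies'] = ['golf', 'chess']
print(people[1]['hobbies'][0])  # golf
```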
c) Function
#Math Function
f(x) = 2*x+3
y = f(2)
Value of y? 7
#JavaScript
function f(x) {
  return 2*x+3
}
#Python
def f(x):
    return 2*x+3
y = f(2)
# Value of y? 7
def sum(a,b):
    return a+b
def mul(a,b):
    return a*b
result = sum(1,2) + mul(10,10)
# What's the value of variable result?
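To check your answer, you can run a version like the sketch below. It names the first function add rather than sum, which is only a naming choice made here so that Python's built-in sum function stays available. The keyword-argument call at the end is an extra illustration.

```python
def add(a, b):
    # Returns the sum of the two arguments
    return a + b

def mul(a, b):
    # Returns the product of the two arguments
    return a * b

result = add(1, 2) + mul(10, 10)
print(result)  # 103, because 3 + 100

# Arguments can also be passed by name
print(mul(a=2, b=5))  # 10
```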
02. Basic Python (2)
a) Conditional Statement
def is_adult(age):
    if age > 20:
        print('Adult') # If the condition is true, print 'Adult'
    else:
        print('Teen') # If the condition is false, print 'Teen'
is_adult(30) # Adult
b) Repetitive Statement
ages = [20,30,15,5,10]
for age in ages:
    print(age)
###
for age in ages:
    if age > 20:
        print('Adult')
    else:
        print('Teen')
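A loop can also build up a result instead of printing on every pass. The sketch below uses the same ages list to compute a total and count how many values pass the age > 20 check. The variable names total and adult_count are chosen here for illustration.

```python
ages = [20, 30, 15, 5, 10]

total = 0
adult_count = 0
for age in ages:
    total = total + age                # running sum of all ages
    if age > 20:
        adult_count = adult_count + 1  # count only ages greater than 20

print(total)              # 80
print(adult_count)        # 1 (only 30 is greater than 20)
print(total / len(ages))  # 16.0
```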
- Conditional Statement + Function + Repetitive Statement
def check_adult(age):
    if age > 20:
        print('Adult')
    else:
        print('Teen')
ages = [20,30,15,5,10]
for age in ages:
    check_adult(age)
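If you want to keep the labels for later use, for example to write them to a file, a function can return the label instead of printing it. A minimal sketch, where age_label is a hypothetical variant of check_adult:

```python
def age_label(age):
    # Hypothetical variant of check_adult: returns the label instead of printing it
    if age > 20:
        return 'Adult'
    return 'Teen'

ages = [20, 30, 15, 5, 10]
labels = []
for age in ages:
    labels.append(age_label(age))  # collect one label per age

print(labels)  # ['Teen', 'Adult', 'Teen', 'Teen', 'Teen']
```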
03. Business Workflow Automation — Clipping (1)
step 1) Outline and strategy of Web Scraping
Click the link below. Right-click anywhere on the page and choose Inspect. You will then see the structure of the HTML. Hover your mouse over the HTML code to see which part of it corresponds to which part of the page. You don't have to memorize the structure of the HTML. Typically, each article sits inside a box called <li>.
If you are not a Korean speaker, feel free to use another site and search for a different stock, such as Tesla instead of Samsung Electronics.
step 2) Installing the library
To use a library that others have created, you need to install it first.
# Run this first. The leading ! tells Colab to run the line as a shell command.
!pip install bs4 requests
# Just copy and paste this code into your Colab notebook.
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get('https://search.naver.com/search.naver?where=news&ie=utf8&sm=nws_hty&query=삼성전자',headers=headers) # fetch the search results page with requests (삼성전자 is Samsung Electronics)
soup = BeautifulSoup(data.text, 'html.parser') # BeautifulSoup parses the HTML so it is easier to search
- Just remember two methods: select_one returns the first matching element, and select returns a list of all matching elements
a = soup.select_one('#sp_nws1 > div.news_wrap.api_ani_send > div > a')
a['href']
a.text
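The difference between select_one and select is easier to see on a small piece of HTML that you control. The sketch below uses a made-up HTML string that only imitates a list of news items. The ids, class names, and URLs in it are assumptions for illustration, not the real page structure.

```python
from bs4 import BeautifulSoup

# Made-up HTML that only imitates a list of news items
html = """
<ul id="news">
  <li><a class="news_tit" href="https://example.com/1">First headline</a></li>
  <li><a class="news_tit" href="https://example.com/2">Second headline</a></li>
  <li>No link in this item</li>
</ul>
"""
soup = BeautifulSoup(html, 'html.parser')

first = soup.select_one('#news > li > a')  # first match only
print(first.text, first['href'])           # First headline https://example.com/1

items = soup.select('#news > li')          # every match, as a list
print(len(items))                          # 3

missing = items[2].select_one('a.news_tit')
print(missing)                             # None, because this <li> has no <a> tag
```

Because select_one returns None when nothing matches, calling .text on the result raises an error for items without a link. Checking for None first avoids that.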
04. Business Workflow Automation — Clipping (2)
step 3) Organize
#Let's get multiple articles
lis = soup.select('#main_pack > section > div > div.group_news > ul > li')
#get <a> tag from the HTML page
lis[0].select_one('a.news_tit')
#Using a repetitive statement, we can organize the code like so:
lis = soup.select('#main_pack > section > div > div.group_news > ul > li')
for li in lis:
    a = li.select_one('a.news_tit')
    print(a.text, a['href'])
#Create a function that takes a keyword (you can change the keyword)
def get_news(keyword):
    headers = {'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
    data = requests.get(f'https://search.naver.com/search.naver?where=news&ie=utf8&sm=nws_hty&query={keyword}',headers=headers)
    soup = BeautifulSoup(data.text, 'html.parser')
    lis = soup.select('#main_pack > section > div > div.group_news > ul > li')
    for li in lis:
        a = li.select_one('a.news_tit')
        print(a.text, a['href'])
If you call the function with a different keyword, for example get_news('현대자동차') (Hyundai Motor), it prints the title and link of each news article found for that keyword.
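A keyword that contains spaces or non-English characters is safer to encode before it goes into the URL. The following is a minimal sketch using Python's standard urllib.parse module. The helper name build_search_url is hypothetical, and the query parameters are simply copied from the URL used above.

```python
from urllib.parse import urlencode

def build_search_url(keyword):
    # Hypothetical helper: same query parameters as the search URL in this lesson
    params = {'where': 'news', 'ie': 'utf8', 'sm': 'nws_hty', 'query': keyword}
    # urlencode escapes spaces and non-ASCII characters in the values
    return 'https://search.naver.com/search.naver?' + urlencode(params)

print(build_search_url('Tesla'))
# https://search.naver.com/search.naver?where=news&ie=utf8&sm=nws_hty&query=Tesla

print(build_search_url('Samsung Electronics'))
# https://search.naver.com/search.naver?where=news&ie=utf8&sm=nws_hty&query=Samsung+Electronics
```

Another option is to pass the dictionary directly as requests.get(url, params=params, headers=headers), which lets requests build the query string for you.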
05. Business Workflow Automation — Excel (1)
step 1) Install the openpyxl library
!pip install openpyxl # openpyxl is also a library that someone else has created.
step 2) Create Excel file
#Copy and paste this code into your Colab
from openpyxl import Workbook
wb = Workbook()
sheet = wb.active
sheet['A1'] = '안녕하세요!' # 'Hello!' in Korean
wb.save("샘플파일.xlsx") # the file name means 'sample file'
wb.close()
Refresh your file tab. Then you’ll see the file that you just created.
#Now let's add data to the file
from openpyxl import Workbook
wb = Workbook()
sheet = wb.active
row = [1,'사과',300] # '사과' means apple
sheet.append(row)
sheet.append(row)
sheet.append(row)
sheet.append(row)
wb.save("샘플파일.xlsx")
wb.close()
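In practice you usually append different rows rather than the same one four times. The sketch below writes a header row and a few made-up rows with a loop, then reads the file back to confirm what was saved. The sample data and the file name sample.xlsx are assumptions for illustration.

```python
from openpyxl import Workbook, load_workbook

# Made-up sample rows: (number, item, price)
rows = [
    (1, 'Apple', 300),
    (2, 'Pear', 500),
    (3, 'Orange', 250),
]

wb = Workbook()
sheet = wb.active
sheet.append(['No', 'Item', 'Price'])  # header row
for row in rows:
    sheet.append(row)                  # each call adds one row below the last
wb.save('sample.xlsx')                 # hypothetical file name
wb.close()

# Read the file back to confirm what was written
loaded = load_workbook('sample.xlsx')
for values in loaded.active.iter_rows(values_only=True):
    print(values)
```

The same loop pattern works for scraped results: append one row per article, for example its title and link, and then save the workbook.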