Week 9: Web Scraping & Requests - 40 Python Snippets
This week covers HTTP requests, BeautifulSoup, extracting web data, and regex for pattern matching. Below are 40 Python snippets categorized into 4 topics (10 per topic).
1️⃣ HTTP Requests with requests (10 Snippets)
1. Install the requests Library
pip install requests
2. Sending a GET Request
import requests
response = requests.get("https://jsonplaceholder.typicode.com/todos/1")
print(response.json())  # Output JSON response
3. Sending a POST Request
import requests
data = {"title": "New Task", "completed": False}
response = requests.post("https://jsonplaceholder.typicode.com/todos", json=data)
print(response.json())
4. Sending a PUT Request (Updating Data)
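A minimal sketch of a PUT update, assuming the same JSONPlaceholder test API used in the GET and POST snippets. The helper names `build_update_request` and `update_todo` are hypothetical; the first prepares the request without sending it so the method and body can be inspected offline.

```python
import requests

# Assumed endpoint: the same public test API used in the GET snippet.
TODO_URL = "https://jsonplaceholder.typicode.com/todos/1"


def build_update_request(url, payload):
    """Prepare a PUT request with a JSON body without sending it."""
    return requests.Request("PUT", url, json=payload).prepare()


def update_todo(url, payload, timeout=10):
    """Send a PUT request that replaces the resource and return the JSON reply."""
    response = requests.put(url, json=payload, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx status codes
    return response.json()


# Inspect what would be sent (no network traffic happens here).
prepared = build_update_request(TODO_URL, {"title": "Updated Task", "completed": True})
print(prepared.method, prepared.url)
print(prepared.body)
```

Calling `update_todo(TODO_URL, {"title": "Updated Task", "completed": True})` would perform the actual update against the test API.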
5. Sending a DELETE Request
6. Handling HTTP Headers
7. Handling Timeouts in Requests
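One way to handle timeouts is sketched below, assuming a hypothetical helper `fetch_with_timeout` that returns `None` instead of raising. The tuple passed as `timeout` is the (connect, read) limit in seconds; the specific values are illustrative.

```python
import requests


def fetch_with_timeout(url, timeout=(3.05, 10)):
    """Return the response, or None if the request times out or fails.

    `timeout` is a (connect timeout, read timeout) pair in seconds.
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        return response
    except requests.exceptions.Timeout:
        # Raised when the connect or read limit is exceeded.
        print(f"Request to {url} timed out")
    except requests.exceptions.RequestException as exc:
        # Base class for the other errors requests raises (bad URL, refused connection, HTTP errors).
        print(f"Request to {url} failed: {exc}")
    return None
```

Without an explicit `timeout`, `requests` waits indefinitely for a slow server, so passing one is usually worthwhile in scrapers.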
8. Downloading an Image
9. Using Session for Persistent Requests
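A short sketch of a persistent session, assuming a made-up `User-Agent` string and hypothetical helpers `make_session` and `fetch_many`. A `requests.Session` keeps headers, cookies, and pooled connections across calls.

```python
import requests


def make_session(user_agent="week9-scraper/0.1"):
    """Create a session whose default headers apply to every request it sends."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch_many(session, urls, timeout=10):
    """Fetch several URLs through one session, reusing connections and cookies."""
    return [session.get(url, timeout=timeout) for url in urls]


session = make_session()

# Preparing a request shows the merged headers without touching the network.
prepared = session.prepare_request(requests.Request("GET", "https://example.com/"))
print(prepared.headers["User-Agent"])
```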
10. Handling Redirects in Requests
2️⃣ Parsing HTML with BeautifulSoup (10 Snippets)
11. Install BeautifulSoup
12. Basic HTML Parsing with BeautifulSoup
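A minimal parsing sketch on an inline HTML string (the markup is invented for illustration), using Python's built-in `html.parser` backend so no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Sample markup, made up for this example.
html = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <h1>Hello, Week 9</h1>
    <p class="intro">BeautifulSoup turns markup into a searchable tree.</p>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

title = soup.title.string                                   # text inside <title>
heading = soup.find("h1").get_text(strip=True)              # first <h1>
intro = soup.find("p", class_="intro").get_text(strip=True)  # <p class="intro">

print(title)
print(heading)
print(intro)
```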
13. Fetching and Parsing a Web Page
14. Extracting All Links from a Page
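A sketch of link extraction, assuming the page HTML is already available as a string (for example from `response.text`). The helper `extract_links` and the sample markup are hypothetical; `urljoin` resolves relative links against the page URL.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]


# Sample markup, made up for this example; the third anchor has no href and is skipped.
sample_html = (
    '<a href="/about">About</a> '
    '<a href="https://example.org/x">External</a> '
    '<a name="top">No href</a>'
)

links = extract_links(sample_html, "https://example.com/")
print(links)
```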
15. Extracting All Image URLs from a Page
16. Extracting Text from a Page
17. Finding Elements by Class Name
18. Finding Elements by ID
19. Extracting Table Data
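A possible approach to table extraction, shown on an invented table. The helper `extract_table` is hypothetical and reads only the first `<table>` on the page, returning each row as a list of cell strings.

```python
from bs4 import BeautifulSoup


def extract_table(html):
    """Return the first table as a list of rows, each a list of cell texts."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        # Header cells (<th>) and data cells (<td>) are treated the same way.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


# Sample markup, made up for this example.
sample_html = """
<table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Alice</td><td>30</td></tr>
  <tr><td>Bob</td><td>25</td></tr>
</table>
"""

rows = extract_table(sample_html)
print(rows)
```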
20. Handling JavaScript-Rendered Pages (Using Selenium)
3️⃣ Extracting Data from Web Pages (10 Snippets)
21. Extracting Specific Elements
22. Extracting Meta Tags
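A sketch of meta tag extraction, assuming tags are keyed by either a `name` or a `property` attribute (the latter is common for Open Graph tags). The helper `extract_meta` and the sample markup are hypothetical.

```python
from bs4 import BeautifulSoup


def extract_meta(html):
    """Map each meta tag's name (or property) to its content value."""
    soup = BeautifulSoup(html, "html.parser")
    meta = {}
    for tag in soup.find_all("meta"):
        key = tag.get("name") or tag.get("property")
        content = tag.get("content")
        if key and content:
            meta[key] = content
    return meta


# Sample markup, made up for this example; the charset tag has no content and is skipped.
sample_html = """
<head>
  <meta charset="utf-8">
  <meta name="description" content="Week 9 notes">
  <meta property="og:title" content="Web Scraping">
</head>
"""

meta = extract_meta(sample_html)
print(meta)
```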
23. Extracting Structured Data (JSON-LD)
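JSON-LD is usually embedded in `<script type="application/ld+json">` tags. The sketch below, with a hypothetical helper `extract_json_ld` and invented sample markup, collects every such block that parses as valid JSON and skips the rest.

```python
import json

from bs4 import BeautifulSoup


def extract_json_ld(html):
    """Return the decoded contents of every JSON-LD script block on the page."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for script in soup.find_all("script", attrs={"type": "application/ld+json"}):
        raw = script.string
        if not raw:
            continue
        try:
            items.append(json.loads(raw))
        except json.JSONDecodeError:
            # Malformed blocks are ignored rather than aborting the whole page.
            continue
    return items


# Sample markup, made up for this example.
sample_html = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Week 9 Notes"}
</script>
<script type="application/ld+json">{not valid json}</script>
"""

items = extract_json_ld(sample_html)
print(items)
```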
24. Extracting Data from Lists
25. Extracting Headings (H1, H2, H3, etc.)
26. Extracting Data from Multiple Pages (Pagination Handling)
27. Web Scraping with User-Agent Spoofing
28. Extracting Links with Specific Text
29. Extracting Email Addresses from Web Pages
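A sketch of email extraction with a regular expression, assuming the page text has already been obtained (for example via `response.text` or `soup.get_text()`). The pattern is a deliberately simple approximation, not a full address validator, and the sample text is invented.

```python
import re

# Simplified pattern: local part, "@", domain, and a TLD of at least two letters.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return unique email addresses in the order they first appear."""
    return list(dict.fromkeys(EMAIL_PATTERN.findall(text)))


# Sample text, made up for this example.
page_text = (
    "Contact support@example.com or sales@example.org. "
    "Again: support@example.com"
)

emails = extract_emails(page_text)
print(emails)
```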
30. Saving Extracted Data to a CSV File
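A minimal sketch of writing scraped records to CSV with the standard `csv` module. The rows, field names, and output location are invented for illustration; the file is written to the system temporary directory.

```python
import csv
import os
import tempfile

# Sample records, made up for this example.
rows = [
    {"title": "First Post", "url": "https://example.com/1"},
    {"title": "Second Post", "url": "https://example.com/2"},
]


def save_to_csv(records, path, fieldnames=("title", "url")):
    """Write a list of dicts to a CSV file with a header row."""
    # newline="" stops the csv module from emitting blank lines on Windows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(fieldnames))
        writer.writeheader()
        writer.writerows(records)


output_path = os.path.join(tempfile.gettempdir(), "scraped_data.csv")
save_to_csv(rows, output_path)
print(f"Wrote {len(rows)} rows to {output_path}")
```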
4️⃣ Regex (re module) for Pattern Matching (10 Snippets)
31. Importing the re Module
32. Finding All Words in a String
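A short sketch using `re.findall` on an invented sentence. Here `\w+` matches runs of letters, digits, and underscores, so punctuation is dropped.

```python
import re

text = "Web scraping with Python 3 is fun!"

# \b marks word boundaries; \w+ matches one or more word characters.
words = re.findall(r"\b\w+\b", text)
print(words)
```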
33. Extracting Phone Numbers
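A sketch of phone number extraction, assuming US-style ten-digit numbers with optional parentheses around the area code and `-`, `.`, or whitespace as separators. Other national formats would need a different pattern; the sample text is invented.

```python
import re

# Assumed format: optional "(", 3 digits, optional ")", separator, 3 digits, separator, 4 digits.
PHONE_PATTERN = re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")

text = "Call 555-123-4567 or (555) 987-6543 today."

phone_numbers = PHONE_PATTERN.findall(text)
print(phone_numbers)
```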
34. Validating Email Addresses
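A sketch of email validation with `re.fullmatch`, which requires the whole string to match rather than just a prefix. The pattern is a pragmatic approximation (it requires at least one dot in the domain) and does not cover every address the email standards allow; `is_valid_email` is a hypothetical helper name.

```python
import re

# Local part, "@", one or more dot-separated domain labels, and a TLD of 2+ letters.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def is_valid_email(address):
    """Return True if the entire string looks like an email address."""
    return EMAIL_RE.fullmatch(address) is not None


for candidate in ["user@example.com", "first.last@sub.example.org", "user@localhost"]:
    print(candidate, is_valid_email(candidate))
```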
(And more regex snippets…)