Hira Arif

Lahore, Punjab

Summary

Data Extraction Specialist with a track record of improving data processing efficiency and accuracy for clients on Upwork. Skilled in Python and problem-solving, delivering clean, high-quality data sets using libraries such as Selenium, Requests, Beautiful Soup, and Scrapy. Known for meticulous attention to detail and the ability to meet tight deadlines in fast-paced environments.

Overview

2 years of professional experience
4 certifications

Accomplishments

Scraping Projects:

  • Login-Protected Sites, e.g. SEMrush, redtoolbox
  • Puzzle-CAPTCHA-Protected Sites (Google reCAPTCHA v2), e.g. Google, claimittexas
  • Real Estate & E-commerce Site Data, e.g. Zillow, ambalaza, vrbo
  • JavaScript / Dynamic Data Scraping, e.g. Genius, Similarweb
  • Static Data Downloading and Parsing; 100 million+ rows extracted since January 26, 2022


Ongoing Projects:

  • Since 2023: scraping a login-protected website for clients using 35 Remote Machines (256 GB) | 230 IPs | 350 Threads
  • 24/7 Scraping of SEMrush and Similarweb
  • HTML Downloading, Parsing, and MongoDB Management


Bots:

  • Reddit
  • YouTube
  • Browser Automation


Cloud Infrastructure - Linux and Windows VPS:

  • Binding Domains to Machines
  • User permissions
  • Resource Management - Storage Volumes
  • Cron Jobs Setup
  • Database Backups and Restore

Skills

  • Python Programming Language
  • IP Rotation
  • Concurrent Threads
  • Bypassing CAPTCHA and Cloudflare
  • Fast and Efficient Processing for Bulk Data (Using Python Advanced Data Structures)
  • HTML Parsing and DOM manipulation
  • Data Consistency (Free from unwanted strings and AI fillers)
  • Part-of-Speech Tagging on Millions of Tokens
  • Grammatical Error Checking on 20 Million Lines of Text
  • Explicit Image tagging
  • HTML to Image Conversion

Work History

Data Extraction Specialist

Upwork
01.2023 - Current
  • Followed all company policies and procedures to deliver quality work.
  • Collected, arranged, and input information into database system.
  • Developed effective improvement plans in alignment with goals and specifications.
  • Frequently inspected production area to verify proper equipment operation.
  • Skills: Big Data, Python (Programming Language), Web Scraping, Selenium, Data Cleaning, Beautiful Soup, Kaggle, Data Collection, PostgreSQL, Automation, Multithreaded Development, Problem Solving, Web Servers, Data Extraction

ChatGPT Prompt Writer

Local
04.2023 - 06.2023
  • Niches: Education, Cooking, Lifestyle, Fitness, Entertainment, E-commerce, SEO, Technical, Development, etc.

Technical Writer

Upwork
11.2022 - 02.2023
  • Worked under the supervision of a senior colleague, Mehvish Ashiq.
  • Followed company policies and editorial guidelines to craft thorough, well-written content.
  • Proved successful working within tight deadlines and a fast-paced environment.
  • Developed online tutorials and web-based training materials for software products.
  • Clients: DelftStack, Java2Blog

Education

Bachelor of Science - Software Engineering

COMSATS University Islamabad
Lahore, Punjab, Pakistan
08.2025

Certification

  • Introduction to Data Science, Cisco - June 2023
  • Python Essentials 2, Cisco - February 2023
  • Python Essentials 1, Cisco - February 2023
  • Basics of Web Scraping with Beautiful Soup for Beginners, Simplilearn - February 2023

Languages

Urdu
Native language
English
Upper intermediate
B2
