Properly ignore first col of tables in scrape_term #74

Merged
madhavarshney merged 1 commit into OpenCourseAPI:master from madhavarshney:bugfix/scraper-first-col on May 23, 2020

@madhavarshney
Member

No description provided.

@madhavarshney requested a review from phi-line on May 23, 2020, 09:25
Comment thread scrape_term.py
rows = t.find_all('tr', {'class': 'CourseRow'})
s = defaultdict(lambda: defaultdict(list))
for tr in rows:
    cols = tr.find_all(lambda tag: tag.name == 'td' and not tag.get_text().isspace())
Collaborator

🏳️ 🏳️ 🏳️
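
For reference, the filter in the hunk above keeps every `td` whose text is not whitespace-only. A minimal sketch of that predicate follows, with plain strings standing in for the result of `tag.get_text()`; the names `is_kept` and `is_nonblank` are hypothetical and not part of `scrape_term.py`.

```python
# Sketch only: plain strings stand in for the text of each <td> cell.

def is_kept(cell_text: str) -> bool:
    """Mirror the `not tag.get_text().isspace()` part of the filter."""
    return not cell_text.isspace()


def is_nonblank(cell_text: str) -> bool:
    """Hypothetical stricter variant that also rejects the empty string."""
    return bool(cell_text.strip())


cells = [" ", "\xa0", "", "CS 1A", "Intro to Programming"]

kept = [text for text in cells if is_kept(text)]
strict = [text for text in cells if is_nonblank(text)]

print(kept)    # ['', 'CS 1A', 'Intro to Programming']
print(strict)  # ['CS 1A', 'Intro to Programming']
```

Note that `str.isspace()` returns `False` for an empty string, so a completely empty cell would pass the first predicate. Whether such cells occur in the scraped tables is not stated in this thread.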

Comment thread scrape_term.py

try:
-    key = get_key(f'{cols[0] if cols[0] else cols[1]}')[0]
+    key = get_key(cols[0])[0]
Member Author

Oh wait, not sure if this is necessary / what it breaks.

Collaborator

I think this is okay now because you popped the first column.
Can you test locally to see if it returns the desired result?

Member Author

According to my testing, it seems to work fine. (Tested with LiveMyPortalData)
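
The reasoning in this thread can be illustrated with a small sketch. It assumes cells are represented as plain strings (in `scrape_term.py` they are BeautifulSoup tags), and the helper names `first_cell_old` and `first_cell_new` are hypothetical. It only shows why, once blank cells are filtered out up front, indexing `cols[0]` can replace the fallback to `cols[1]`.

```python
# Sketch only: strings stand in for table cells; helper names are hypothetical.

def first_cell_old(cols):
    # Pre-change shape: fall back to the second cell if the first is falsy.
    return cols[0] if cols[0] else cols[1]


def first_cell_new(cols):
    # Post-change shape: whitespace-only cells are dropped first,
    # so index 0 is already the first meaningful cell.
    kept = [c for c in cols if not c.isspace()]
    return kept[0]


row = ["\xa0", "CS 1A", "12345"]

# A non-breaking space is a non-empty (truthy) string, so the fallback never fires.
print(repr(first_cell_old(row)))  # '\xa0'
print(repr(first_cell_new(row)))  # 'CS 1A'

# A row with only blank cells leaves nothing to index.
try:
    first_cell_new([" ", "\xa0"])
except IndexError:
    print("no usable cells in this row")
```

In the sketch, a row made up entirely of blank cells raises `IndexError`; the `try:` visible in the hunk suggests the real code handles failures at this point, though the handler itself is not shown here.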

Comment thread scrape_term.py
@madhavarshney requested a review from phi-line on May 23, 2020, 09:34
Collaborator

@phi-line left a comment

👍

@madhavarshney merged commit 50047e1 into OpenCourseAPI:master on May 23, 2020
@madhavarshney deleted the bugfix/scraper-first-col branch on August 9, 2020, 05:04