You are searching for the ul in the wrong way.
You find the h2 with the text Dialogue and then look for its sibling ul, but the ul with the dialogue lines is not a sibling of that h2 because it is nested inside a div.
There is a sibling ul, but it comes after Location, so you get the text from the location section instead.
You have to search for the ul inside <div id="needolin-dialogues"> (without "sibling").
You can even search for the li elements in this div directly, e.g. with a CSS selector:
    soup.select("div#needolin-dialogues li")

Working function:
    def scrape_enemy_dialogue(self, url):
        #print("url:", url)
        soup = self.fetch_page(url)

        # every li inside the div that holds the dialogue lines
        dialogue = [
            li.get_text(strip=True)
            for li in soup.select("div#needolin-dialogues li")
        ]
        #print("dialogue:", dialogue)

        return dialogue

Part of output:
    Scraping Mossgrub...
    url: https://hollowknight.wiki/w/Mossgrub
    dialogue: ['Protect us, mother!', 'Call for danger, hide away...', 'Young must eat, grow or die...', 'Sleep and change, have no fear...']
    Scraping Massive Mossgrub...
    url: https://hollowknight.wiki/w/Massive_Mossgrub
    dialogue: ["Mother's voice... distant...", 'Little sisters... hide away...', 'Eat and grow... larger...', 'Change... hidden change...']
    Scraping Mossmir...
    url: https://hollowknight.wiki/w/Mossmir
    dialogue: ['Protect us, mother!', 'Call for danger, hide away...', 'Young must eat, grow or die...', 'Sleep and change, have no fear...']

By the way:
the same works with find() and find_all():

    def scrape_enemy_dialogue(self, url):
        soup = self.fetch_page(url)

        dialogue = []

        div = soup.find("div", id="needolin-dialogues")
        # div = soup.find("div", {"id": "needolin-dialogues"})

        # find() returns None when the page has no such div
        if div:
            dialogue = [li.get_text(strip=True) for li in div.find_all("li")]

        return dialogue
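
To see why the sibling search picks up the location text, here is a minimal sketch on a made-up HTML fragment. The markup and the "Some location" item are assumptions for illustration only, not the wiki's real page, and the find_next_sibling() call is only my guess at what the original code did.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment: the structure is an assumption for illustration,
# not a copy of the wiki's real markup.
HTML = """
<h2>Dialogue</h2>
<div id="needolin-dialogues">
  <ul>
    <li>Protect us, mother!</li>
    <li>Call for danger, hide away...</li>
  </ul>
</div>
<h2>Location</h2>
<ul>
  <li>Some location</li>
</ul>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Sibling search: the ul with the dialogue is nested in the div, so it is
# skipped and the first sibling ul is the one after "Location".
heading = soup.find("h2", string="Dialogue")
sibling_ul = heading.find_next_sibling("ul")
wrong = [li.get_text(strip=True) for li in sibling_ul.find_all("li")]

# Searching inside the div reaches the dialogue lines directly.
right = [li.get_text(strip=True) for li in soup.select("div#needolin-dialogues li")]

print("sibling search:", wrong)
print("div search:", right)
```

The first list holds the location item and the second holds the dialogue lines, which is the same difference you see on the real pages.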