2024-01-05 22:55:59 +01:00
# ghc-tagsoup
TagSoup is a library for parsing and extracting information from (possibly malformed) HTML/XML documents.
It supports the HTML5 specification and can parse either well-formed XML or unstructured, malformed HTML from the web.
The library also provides useful functions to extract information from an HTML document, making it ideal for screen-scraping.
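
As a rough illustration of the screen-scraping use case, the following minimal sketch collects link targets from a deliberately malformed HTML fragment. It assumes the `tagsoup` package is installed and uses `parseTags`, `isTagOpenName`, and `fromAttrib` from `Text.HTML.TagSoup`; the `extractLinks` helper and the sample input are hypothetical and only serve the example.

```haskell
module Main (main) where

import Text.HTML.TagSoup (fromAttrib, isTagOpenName, parseTags)

-- | Hypothetical helper: return the non-empty @href@ value of every
-- opening @<a>@ tag found in the input.
extractLinks :: String -> [String]
extractLinks html =
  [ href
  | tag <- parseTags html          -- tokenize the input into a flat tag list
  , isTagOpenName "a" tag          -- keep only opening <a ...> tags
  , let href = fromAttrib "href" tag
  , not (null href)                -- fromAttrib yields "" when the attribute is absent
  ]

main :: IO ()
main = do
  -- Sample input with unclosed <a> and <p> tags, as often seen on the web.
  let html = "<p>See <a href=\"https://example.com\">this <a href=\"/about\">about"
  mapM_ putStrLn (extractLinks html)
```

Because `parseTags` produces a flat list of tags rather than a tree, unclosed elements like the ones above do not need special handling for this kind of extraction.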