usp.objects.sitemap
Objects that represent one of the found sitemaps.
- class usp.objects.sitemap.AbstractSitemap
Bases: object
Abstract sitemap.
- all_pages() → Iterator[SitemapPage]
Return iterator which yields all pages of this sitemap and linked sitemaps (if any).
- Returns:
Iterator which yields all pages of this sitemap and linked sitemaps (if any).
- all_sitemaps() → Iterator[AbstractSitemap]
Return iterator which yields all sub-sitemaps descended from this sitemap.
- Returns:
Iterator which yields all sub-sitemaps descended from this sitemap.
- to_dict(with_pages=True) → dict
Return a dictionary representation of the sitemap, including its child sitemaps and, optionally, its pages.
- Parameters:
with_pages – Include pages in the representation of this sitemap or descendants.
- Returns:
Dictionary representation of the sitemap.
- abstract property pages: list[SitemapPage]
Return a list of pages found in a sitemap (if any).
Should return an empty list if this sitemap cannot have sub-pages, to allow traversal with a consistent interface.
- Returns:
the list of pages, or an empty list.
- abstract property sub_sitemaps: list[AbstractSitemap]
Return a list of sub-sitemaps of this sitemap (if any).
Should return an empty list if this sitemap cannot have sub-sitemaps, to allow traversal with a consistent interface.
- Returns:
the list of sub-sitemaps, or an empty list.
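Because `pages` and `sub_sitemaps` both fall back to an empty list, a tree of sitemaps can be walked without checking the concrete type of each node. The following is a minimal sketch of that idea, not the library's own traversal code; `count_pages` and the `SimpleNamespace` stand-ins are hypothetical and only mimic the two documented properties.

```python
from types import SimpleNamespace


def count_pages(sitemap) -> int:
    """Count pages in a sitemap tree using only `pages` and `sub_sitemaps`."""
    total = len(sitemap.pages)
    for child in sitemap.sub_sitemaps:
        # No type check needed: nodes without children expose an empty list.
        total += count_pages(child)
    return total


# Hypothetical stand-ins (not library objects) exposing the same two properties.
leaf_a = SimpleNamespace(pages=["p1", "p2"], sub_sitemaps=[])
leaf_b = SimpleNamespace(pages=["p3"], sub_sitemaps=[])
broken = SimpleNamespace(pages=[], sub_sitemaps=[])  # behaves like an invalid sitemap
root = SimpleNamespace(pages=[], sub_sitemaps=[leaf_a, leaf_b, broken])

print(count_pages(root))
```

With real objects the same function should accept any `AbstractSitemap` subclass, since each one provides both properties.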
- class usp.objects.sitemap.InvalidSitemap
Bases: AbstractSitemap
Invalid sitemap, e.g. one that can’t be parsed.
- __init__(url: str, reason: str)
Initialize a new invalid sitemap.
- Parameters:
url – Sitemap URL.
reason – Reason why the sitemap is deemed invalid.
- to_dict(with_pages=True) → dict
Return a dictionary representation of the sitemap, including its child sitemaps and, optionally, its pages.
- Parameters:
with_pages – Include pages in the representation of this sitemap or descendants.
- Returns:
Dictionary representation of the sitemap.
- property pages: list[SitemapPage]
Return an empty list of pages, as invalid sitemaps have no pages.
- Returns:
Empty list of pages.
- property reason: str
Return reason why the sitemap is deemed invalid.
- Returns:
Reason why the sitemap is deemed invalid.
- property sub_sitemaps: list[AbstractSitemap]
Return an empty list of sub-sitemaps, as invalid sitemaps have no sub-sitemaps.
- Returns:
Empty list of sub-sitemaps.
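One use of `reason` is to report which sitemaps in a tree failed to load. The sketch below assumes that each sitemap exposes the `url` passed to its constructor as an attribute (this section does not document it) and detects invalid sitemaps by the presence of `reason`. With the library imported, an `isinstance` check against `InvalidSitemap` would be the more direct test. `StubSitemap` and `invalid_report` are hypothetical names used only for illustration.

```python
from typing import Iterator, List, Optional, Sequence, Tuple


class StubSitemap:
    """Hypothetical stand-in for a sitemap object; not the library class."""

    def __init__(self, url: str, children: Sequence["StubSitemap"] = (),
                 reason: Optional[str] = None) -> None:
        self.url = url  # assumed attribute, see the note above
        self._children = list(children)
        if reason is not None:
            self.reason = reason  # only invalid sitemaps carry a reason

    def all_sitemaps(self) -> Iterator["StubSitemap"]:
        # Yield every descendant, mirroring the documented behaviour.
        for child in self._children:
            yield child
            yield from child.all_sitemaps()


def invalid_report(root) -> List[Tuple[str, str]]:
    """Return (url, reason) pairs for every invalid descendant of `root`."""
    return [(sm.url, sm.reason) for sm in root.all_sitemaps()
            if hasattr(sm, "reason")]


good = StubSitemap("https://example.com/pages.xml")
bad = StubSitemap("https://example.com/broken.xml", reason="Unable to parse")
root_sitemap = StubSitemap("https://example.com/sitemap.xml", children=[good, bad])

for url, reason in invalid_report(root_sitemap):
    print(f"{url}: {reason}")
```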
Index Sitemaps
- class usp.objects.sitemap.AbstractIndexSitemap
Bases: AbstractSitemap
Abstract sitemap with URLs to other sitemaps.
- __init__(url: str, sub_sitemaps: list[AbstractSitemap])
Initialize index sitemap.
- Parameters:
url – Sitemap URL.
sub_sitemaps – Sub-sitemaps that are linked to from this sitemap.
- all_pages() → Iterator[SitemapPage]
Return iterator which yields all pages of this sitemap and linked sitemaps (if any).
- Returns:
Iterator which yields all pages of this sitemap and linked sitemaps (if any).
- all_sitemaps() → Iterator[AbstractSitemap]
Return iterator which yields all sub-sitemaps of this sitemap.
- Returns:
Iterator which yields all sub-sitemaps of this sitemap.
- to_dict(with_pages=True) → dict
Return a dictionary representation of the sitemap, including its child sitemaps and, optionally, its pages.
- Parameters:
with_pages – Include pages in the representation of this sitemap or descendants.
- Returns:
Dictionary representation of the sitemap.
- property pages: list[SitemapPage]
Return an empty list of pages, as index sitemaps have no pages.
- Returns:
Empty list of pages.
- property sub_sitemaps: list[AbstractSitemap]
Return a list of sub-sitemaps of this sitemap (if any).
Should return an empty list if this sitemap cannot have sub-sitemaps, to allow traversal with a consistent interface.
- Returns:
the list of sub-sitemaps, or an empty list.
- class usp.objects.sitemap.IndexWebsiteSitemap
Bases: AbstractIndexSitemap
Website’s root sitemaps, including robots.txt and extra ones.
- class usp.objects.sitemap.IndexXMLSitemap
Bases: AbstractIndexSitemap
XML sitemap with URLs to other sitemaps.
- class usp.objects.sitemap.IndexRobotsTxtSitemap
Bases: AbstractIndexSitemap
robots.txt sitemap with URLs to other sitemaps.
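Index sitemaps can nest, for example a website-level index that links to a robots.txt index, which in turn links to XML sitemaps. A small helper that prints the class name of each node makes that nesting visible. This is an illustrative sketch: `outline`, `StubIndex`, and `StubPages` are hypothetical, and the stubs only reproduce the documented `pages` and `sub_sitemaps` properties.

```python
from typing import List


class StubIndex:
    """Hypothetical stand-in for an index sitemap."""

    def __init__(self, sub_sitemaps):
        self.sub_sitemaps = sub_sitemaps
        self.pages = []  # index sitemaps have no pages of their own


class StubPages:
    """Hypothetical stand-in for a pages sitemap."""

    def __init__(self, pages):
        self.pages = pages
        self.sub_sitemaps = []  # pages sitemaps have no sub-sitemaps


def outline(sitemap, depth: int = 0) -> List[str]:
    """Return one indented line per sitemap, parents before children."""
    indent = "  " * depth
    lines = [f"{indent}{type(sitemap).__name__}: {len(sitemap.pages)} page(s)"]
    for child in sitemap.sub_sitemaps:
        lines.extend(outline(child, depth + 1))
    return lines


tree = StubIndex([StubIndex([StubPages(["a", "b"])]), StubPages(["c"])])
print("\n".join(outline(tree)))
```

Applied to real objects, the printed names would be the concrete classes listed in this section, such as `IndexWebsiteSitemap` or `PagesXMLSitemap`.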
Page Sitemaps
- class usp.objects.sitemap.AbstractPagesSitemap
Bases: AbstractSitemap
Abstract sitemap that contains URLs to pages.
- __init__(url: str, pages: list[SitemapPage])
Initialize new pages sitemap.
- Parameters:
url – Sitemap URL.
pages – List of pages found in a sitemap.
- all_pages() → Iterator[SitemapPage]
Return iterator which yields all pages of this sitemap and linked sitemaps (if any).
- Returns:
Iterator which yields all pages of this sitemap and linked sitemaps (if any).
- all_sitemaps() → Iterator[AbstractSitemap]
Return iterator which yields all sub-sitemaps descended from this sitemap.
- Returns:
Iterator which yields all sub-sitemaps descended from this sitemap.
- to_dict(with_pages=True) → dict
Return a dictionary representation of the sitemap, including its child sitemaps and, optionally, its pages.
- Parameters:
with_pages – Include pages in the representation of this sitemap or descendants.
- Returns:
Dictionary representation of the sitemap.
- property pages: list[SitemapPage]
Load pages from disk swap file and return them.
- Returns:
List of pages found in the sitemap.
- property sub_sitemaps: list[AbstractSitemap]
Return an empty list of sub-sitemaps, as pages sitemaps have no sub-sitemaps.
- Returns:
Empty list of sub-sitemaps.
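Since `all_pages()` returns an iterator, a caller that only needs a sample can stop early instead of building the full `pages` list. The sketch below shows the pattern with `itertools.islice`. `first_pages` and `StubPagesSitemap` are hypothetical, and the stub's `yielded` counter exists only to show how many items were actually pulled from the iterator.

```python
from itertools import islice
from typing import Iterator, List


class StubPagesSitemap:
    """Hypothetical stand-in exposing only `all_pages()`."""

    def __init__(self, pages: List[str]) -> None:
        self._pages = pages
        self.yielded = 0  # bookkeeping for the demonstration only

    def all_pages(self) -> Iterator[str]:
        for page in self._pages:
            self.yielded += 1
            yield page


def first_pages(sitemap, limit: int) -> list:
    """Return at most `limit` pages without exhausting the iterator."""
    return list(islice(sitemap.all_pages(), limit))


stub = StubPagesSitemap([f"page-{i}" for i in range(1000)])
sample = first_pages(stub, 3)
print(sample, stub.yielded)
```

Whether stopping early saves work with the real classes depends on how the library produces pages internally, so treat any saving as something to measure rather than assume.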
- class usp.objects.sitemap.PagesXMLSitemap
Bases: AbstractPagesSitemap
XML sitemap that contains URLs to pages.
- class usp.objects.sitemap.PagesTextSitemap
Bases: AbstractPagesSitemap
Plain text sitemap that contains URLs to pages.
- class usp.objects.sitemap.PagesRSSSitemap
Bases: AbstractPagesSitemap
RSS 2.0 sitemap that contains URLs to pages.
- class usp.objects.sitemap.PagesAtomSitemap
Bases: AbstractPagesSitemap
Atom 0.3 / 1.0 sitemap that contains URLs to pages.