Project: Piano-extraction

Directly pointed Manifest

When the cache.manifest is absent, the sitemap should
be obtained directly from another source.

The file format is the same as that of the cache.manifest: one file
per line. The markup is also the same: the standard cache.manifest
keywords and comment markup, as well as the Piano Extraction
extensions, work on the given manifest.
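
As an illustration of how such a manifest can be read, here is a minimal sketch in Ruby. The helper name files_to_extract and the rule that only entries in the default (CACHE) section are extracted are assumptions drawn from the scenarios on this page, not Piano Extraction's actual API. The Piano Extraction extensions are not specified here, so the sketch does not cover them.

```ruby
# Hypothetical helper: not part of Piano Extraction's real API.
# Section headers defined by the standard cache manifest format.
SECTION_HEADERS = %w[CACHE: NETWORK: FALLBACK: SETTINGS:].freeze

# Returns the entries listed in the CACHE section of a manifest,
# skipping the "CACHE MANIFEST" header, blank lines, comments
# and entries that belong to other sections.
def files_to_extract(manifest_text)
  section = "CACHE:" # entries before any header belong to the cache section
  files = []
  manifest_text.each_line do |raw|
    line = raw.strip
    next if line.empty? || line.start_with?("#")
    next if line == "CACHE MANIFEST"
    if SECTION_HEADERS.include?(line)
      section = line
      next
    end
    files << line if section == "CACHE:"
  end
  files
end

# The manifest used in the Background of this feature.
manifest = <<~MANIFEST
  CACHE MANIFEST
  index
  # Another keyword
  NETWORK:
  ignore_me
MANIFEST

p files_to_extract(manifest) # prints ["index"]
```

With the manifest from the Background, only "index" is returned, which matches the files the "Getting the files" scenario expects to find (and not find) in "temp".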

The manifest is not required to be served with any specific
Content-Type header.

Next iteration: if the manifest does not exist at all,
the robots.txt should be scanned and the site crawled
according to the robots.txt directives.
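
Since this behaviour is only planned, the following Ruby sketch is purely illustrative of what "crawling under the robots.txt directives" could involve. The helpers disallowed_prefixes and crawlable? are hypothetical names, and the parsing is deliberately simplified: it only honours Disallow rules that follow a "User-agent: *" line and ignores Allow rules, wildcards and multi-agent groups.

```ruby
# Hypothetical helpers: a simplified reading of robots.txt,
# not an implementation of the full Robots Exclusion Protocol.

# Collects the Disallow path prefixes that apply to every crawler.
def disallowed_prefixes(robots_text)
  applies = false
  prefixes = []
  robots_text.each_line do |raw|
    line = raw.sub(/#.*/, "").strip # drop comments and surrounding spaces
    next if line.empty?
    field, value = line.split(":", 2).map(&:strip)
    next if value.nil? # not a "field: value" line
    case field.downcase
    when "user-agent"
      applies = (value == "*")
    when "disallow"
      prefixes << value if applies && !value.empty?
    end
  end
  prefixes
end

# A path may be crawled when no disallowed prefix matches it.
def crawlable?(path, prefixes)
  prefixes.none? { |prefix| path.start_with?(prefix) }
end

robots = <<~ROBOTS
  User-agent: *
  Disallow: /private
ROBOTS

prefixes = disallowed_prefixes(robots)
p crawlable?("/index", prefixes)         # prints true
p crawlable?("/private/notes", prefixes) # prints false
```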

Background: Setup of server with no link to the manifest
  Given the "arena/directly-pointed" folder contains "index.haml" with:
    """
    !!! 5
    %html
      %head
        %title Extraction with direct pointing
      %body
        %h1 Pointed directly
    """
  And the "arena/directly-pointed/public" folder contains "extract.manifest" with:
    """
    CACHE MANIFEST
    index
    # Another keyword
    NETWORK:
    ignore_me
    """
  And I run the server on "arena/directly-pointed"

Scenarios

@arena @piano
Scenario: Fetch with manifest (and fail)
  Given I enter the localhost server
  When I try to use the manifest
  Then I should see an error
  And the error should match /No.*manifest.*/

@arena @piano @temp
Scenario: Getting the files
  Given I enter the localhost server
  And I tell the extractor to use "extract.manifest" on the server
  When I order the download on "temp"
  Then I should have "Pointed directly" in "temp/index"
  And there should not be a file "temp/CACHE MANIFEST"
  And there should not be a file "temp/NETWORK"
  And there should not be a file "temp/ignore_me"
  And there should not be in "temp" a file matching /keyword/

Last published over 7 years ago by xaviervia.