download-by-date-hint: add yearly debugging output
FIX: use the month of the date as month, not the day
datehint: only 10 downloads per processor
add commented-out example for crawling one seed
merge
merge
working links in README
merge guile-fcp
always access the latest version
month starts at 0
remove limitation to 10 known IDs
add status for dl by date hint
avoid weeks earlier than the date in the yearly date hint
add a readme
merge
merge guile-fcp and wotdump
anonymize: use deduplicated by default
add missing newlines
add full deduplication
add anonymization script
fix stack overflow: replace flatten with map lambda map
avoid stumbling over incorrectly formatted trust values
working edge CSV export, importable in Gephi
parse the first 3 files
fix initial line + document more
polish a bit
write the first line of the matrix
start trust-list->csv parser
parse all downloaded IDs
can parse the trust values
run parse-trust-values
begin parser to turn the parsed files into standard formats
flatten the return value of the crawled wot
allow passing a key as seed-id
fix: forgot to give string-take the argument
debug: wot-uri-key broke when getting a key.
crawl: only redownload if #:redownload #t
clearer if
comment out debug output
crawl the whole WoT.
Use par-map + fix crawl-wot breakage
Refactor + crawl the full WoT starting at the seed-id
Crawling the full WoT by date hint works
Provide more output.
Crawl all versions and save with the date from the weekly date hint.
hopefully the final fix
workaround for a bug I don’t understand.
add: first argument gives seed
mostly working crawler
now actually working crawler
working WoT crawler
first try at a WoT crawler. breaks at parsing one XML file.
merge the license file and change it to LGPL.
provide copyright information.
Init commit
space before comment sign
add the initial racket implementation, too.
make fcp.scm work
also report the line with the message name.
added fcp.scm for guile from dinky's evil twin