DH-2

In 1950-59 there were 4037 publications in NZ (a publication being a pamphlet of 5+ pages, or a novel). Where are they, how do we get hold of them, and can they be freed?

  • Publications NZ - catalogues all of these, in MARC, and is federated to WorldCat. US copyright law restricts material from 1870 onwards.
  • Some material will need consultation with iwi
  • Reclaiming New Zealand's Digitised Heritage project - Kiwi Alex
  • Today's bibliographies from libraries may not match what is actually online or in physical storage. Items can be in the stacks, in storage, moved, lost, destroyed, etc. It may be possible to use cross-catalogue or cross-media data to track books down (e.g. mentions in Papers Past)
  • Maybe leave out unpublished data
  • How do we get institutions to lend the books to digitise? NL won't lend out valuable material without conservator reporting.
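
A minimal sketch of the cross-catalogue idea above: normalise title and year into a crude key, then list the bibliography entries that no holdings record matches. The record layout (dicts with "title" and "year") and the function names are assumptions for illustration, not the format of Publications NZ or any real catalogue; real matching would want MARC identifiers and fuzzier comparison.

```python
import re
import unicodedata


def match_key(title, year):
    """Build a crude comparison key: lower-cased ASCII title words plus the year."""
    # Decompose accented letters (e.g. macrons) and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_title = decomposed.encode("ascii", "ignore").decode("ascii")
    # Keep only runs of letters and digits, so punctuation and spacing are ignored.
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    return (" ".join(words), year)


def untraced(bibliography, holdings):
    """Return bibliography entries with no matching holdings record."""
    held = {match_key(h["title"], h["year"]) for h in holdings}
    return [b for b in bibliography if match_key(b["title"], b["year"]) not in held]
```

Entries returned by `untraced` are the ones worth chasing through other sources, such as mentions in Papers Past.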

What format, and how do we make it text-searchable?

  • Agree on something like METS-ALTO and DC, and federate with OAI-PMH and/or use Digital NZ.
  • NL scanned each page as a 300 dpi colour TIFF image, then combined them into a PDF with page image + OCR text.
  • e-books in EPUB, which is (more or less) zipped HTML. Kindle uses MobiPocket, another format based on Open eBook.
  • OPDS is a syndication format - like RSS but for e-books.
  • http://stats.govt.nz/ ← fully searchable open access (XML) yearbook data.
  • TEI XML can be huge and possibly redundant for many use cases - http://www.tei-c.org/
  • Project Gutenberg offers many formats - but some are auto-generated from a master
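
Since EPUB is (more or less) zipped HTML, a rough sketch of making one text-searchable is to open it as a zip and pull the text out of each HTML member. This uses only the Python standard library; the function name `epub_text` is hypothetical, and the sketch ignores the EPUB manifest and reading order, so treat it as a starting point rather than a proper EPUB reader.

```python
import io
import zipfile
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect the non-blank text nodes of an HTML/XHTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def epub_text(fileobj):
    """Map each HTML member of an EPUB (a zip archive) to its plain text."""
    texts = {}
    with zipfile.ZipFile(fileobj) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                collector = _TextCollector()
                markup = archive.read(name).decode("utf-8", errors="replace")
                collector.feed(markup)
                collector.close()
                texts[name] = " ".join(collector.chunks)
    return texts
```

`epub_text` accepts a path or a file-like object, so it can be fed `io.BytesIO(data)` for files already in memory. The resulting text could then go into whatever search index is chosen.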

How do we make it available online?

  • Some data will be very specific and of little commercial or even research value.
  • Others will have high commercial value

Some have been digitised already.

  • Digitisation efforts are already under way at National Library.
  • Some material is in Google Books/HathiTrust, but sourced from the US (with little communication with the National Library of NZ); it can be tricky to get material from them
  • Some is public domain, some copyrighted

Do we want to cover periodicals?

  • Quite probably yes!
  • RILM are going about digitising.

Who will host it, who will "own" it and maintain it over time?

  • National Library seems the most sensible fit
  • Public/Private collaboration; should we delete data that is objected to by one or two stakeholders?
    • Reliance on corporate law and upholding contracts
    • Could the private organisations be not-for-profit, or similarly chartered?
    • Retain public ownership

Audience

Where is the data for matching your message to the right audience and to the platform they tend to use? For example, for 16-year-olds we need Facebook, but not so much for retirees?!

  • Media studies - in primary and secondary education, there are "bring your own device" initiatives which may have good data.
  • Sometimes lack of demand is because people don't know it's available and/or don't know they're looking for it
  • Local information could be sliced-and-diced by locality, person, and so on (semantic metadata) and be highly relevant to the punters.
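
As a rough illustration of slicing by locality, person, and so on: if each record carries lists of facet values, a simple inverted index gives the "everything about this place" view. The field names ("locality", "person", "title") and the function name are assumptions for this sketch, not an agreed metadata schema.

```python
from collections import defaultdict


def build_facets(records, facet_fields=("locality", "person")):
    """Index record titles by each value of each facet field."""
    index = {field: defaultdict(list) for field in facet_fields}
    for record in records:
        for field in facet_fields:
            # A record may lack a facet entirely; treat that as no values.
            for value in record.get(field, []):
                index[field][value].append(record["title"])
    return index
```

Looking up `build_facets(records)["locality"]["Otago"]` would then list every title tagged with that locality, in the order the records were supplied.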