PDF.JS

Rendering PDFs using the open web

PDF.JS logo here

About me

- Julian Viereck | @jviereck | +Julian Viereck -

Bespin | Skywriter | Ace

Firefox

ETH

PDF.JS

Sorry!

  • Won't dive deep into HOW we achieve stuff
  • Hope talk gets accepted for JSConf.EU ;)

What Is The PDF.JS Project

  • building faithful & efficient PDF viewer
  • HTML5 technology experiment
  • no native code
  • Mozilla Labs Project - Open Source (GitHub)
  • Not Firefox-Specific - all modern browsers
  • 1.4 MB uncompressed JS, > 35,000 lines of code
  • viewer in different languages

Actually, It's Many Projects

  • JPEG2000
  • JBig2
  • Flate-/Predictor-/LZW-Stream
  • PostScript parser
  • Bidi algorithm
  • Font rewriter
  • + "normal" PDF parser

Demo

Overview Processing

//
// Fetch the PDF document from the URL using promises
//
PDFJS.getDocument('helloworld.pdf').then(function(pdf) {
  // Using promise to fetch the page
  pdf.getPage(1).then(function(page) {
    var scale = 1.5;
    var viewport = page.getViewport(scale);

    //
    // Prepare canvas using PDF page dimensions
    //
    var canvas = document.getElementById('the-canvas');
    var context = canvas.getContext('2d');
    canvas.height = viewport.height;
    canvas.width = viewport.width;

    //
    // Render PDF page into canvas context
    //
    var renderContext = {
      canvasContext: context,
      viewport: viewport
    };
    page.render(renderContext);
  });
});

How To Render?

  • Text
  • Images
  • Drawing commands

Text

  • There are lots of different font formats!
  • Fonts are converted to OpenType
  • use CSS for loading:
    
    @font-face {
      font-family:'font0'; src:url(data:font/opentype;base64, ...)
    }
                      
  • Fonts are checked by browser
  • Need to rebuild malformed fonts :/
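
The loading step above can be sketched in JavaScript. This is a minimal illustration, not the project's actual code: the helper names `bytesToBinaryString` and `buildFontFaceRule` are hypothetical, and it assumes the font has already been converted to OpenType bytes.

```javascript
// Hypothetical helper: convert an array of byte values into a binary string,
// which is the input format base64 encoders such as btoa() expect.
function bytesToBinaryString(bytes) {
  var str = '';
  for (var i = 0; i < bytes.length; i++) {
    str += String.fromCharCode(bytes[i]);
  }
  return str;
}

// Hypothetical helper: build a CSS @font-face rule that embeds the
// (already converted) OpenType bytes as a base64 data URI.
function buildFontFaceRule(fontName, bytes) {
  var binary = bytesToBinaryString(bytes);
  // btoa() is available in browsers; fall back to Buffer elsewhere.
  var base64 = typeof btoa === 'function'
    ? btoa(binary)
    : Buffer.from(binary, 'binary').toString('base64');
  return "@font-face { font-family:'" + fontName + "'; " +
         'src:url(data:font/opentype;base64,' + base64 + ') }';
}
```

In a browser, the returned rule could be appended to a `<style>` element; the browser then validates the font, which is why malformed fonts have to be rebuilt first.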

Images

  • JPEG streams:
    
    var DOMImg = document.createElement('img');
    var byteStr = bytesToString(bytes);
    var data = window.btoa(byteStr);
    DOMImg.src = 'data:image/jpeg;base64,' + data;
    
  • Not JPEG streams:
    var canvas = new ScratchCanvas(width, height);
    // getImageData/putImageData live on the 2D context, not on the canvas
    var ctx = canvas.getContext('2d');
    var imgData = ctx.getImageData(0, 0, width, height);
    var decodedBytes = jpeg2000Stream.getBytes();
    fillWithPixelData(decodedBytes, imgData);
    ctx.putImageData(imgData, 0, 0);
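
What a pixel-filling helper might look like is sketched below. It is an assumption for illustration only: the name `fillWithRgbPixelData` is hypothetical, and it assumes the decoder returns tightly packed 8-bit RGB, while canvas `ImageData` stores RGBA.

```javascript
// Hypothetical sketch: copy tightly packed 8-bit RGB bytes into an
// ImageData-like object, whose data array holds RGBA values.
function fillWithRgbPixelData(rgbBytes, imgData) {
  var dest = imgData.data;
  var pixelCount = imgData.width * imgData.height;
  var src = 0;
  var dst = 0;
  for (var i = 0; i < pixelCount; i++) {
    dest[dst++] = rgbBytes[src++]; // red
    dest[dst++] = rgbBytes[src++]; // green
    dest[dst++] = rgbBytes[src++]; // blue
    dest[dst++] = 255;             // alpha: fully opaque
  }
}
```

Other color spaces (grayscale, CMYK) or bit depths would need their own conversion before the data is handed to the canvas.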
                    

Drawing Commands

  • Partial Evaluator builds OperationList
  • Canvas backend executes OperationList items
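
The split between building and executing an operation list can be sketched as follows. This is a simplified model under stated assumptions, not the project's implementation: the layout (parallel arrays of method names and arguments) and the function names `createOperationList`, `addOp`, and `executeOperationList` are hypothetical.

```javascript
// Hypothetical, simplified operation list: two parallel arrays holding
// canvas method names and the arguments for each call.
function createOperationList() {
  return { fnArray: [], argsArray: [] };
}

// Record one drawing operation (done by the evaluator while parsing).
function addOp(opList, fn, args) {
  opList.fnArray.push(fn);
  opList.argsArray.push(args || []);
}

// Replay every recorded operation against a backend object, for example
// a canvas 2D context.
function executeOperationList(opList, ctx) {
  for (var i = 0; i < opList.fnArray.length; i++) {
    var fn = opList.fnArray[i];
    ctx[fn].apply(ctx, opList.argsArray[i]);
  }
}
```

For example, recording `addOp(ops, 'fillRect', [10, 10, 100, 100])` and later calling `executeOperationList(ops, canvas.getContext('2d'))` draws the rectangle. Keeping the list as plain data means it can be built in one place and executed elsewhere.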

Why Are You Doing This?

  • aka. isn't there a C/C++ library?
  • aka. isn't JavaScript too slow?

Pretty Simple:

Performance is not the only measure

Why Are You Doing This?

Push WebPlatform: Printing

  • Printing on the web very limited right now
  • No way to achieve native printing experience
  • No way to define content once printing started
  • NEED: New API for printing
    • mozPrintCallback
    • does not change print layout
    • define canvas content during printing
    • send drawing commands "directly" to printer

Example: mozPrintCallback

// Called when the canvas gets printed.
canvas.mozPrintCallback = function(obj) {
  // Get the rendering context from the print object
  var ctx = obj.context;
  ctx.fillStyle = 'red';
  ctx.fillRect(10, 10, 100, 100);

  (function renderFunc() {
    someRendering(ctx);

    // If something went wrong, then abort(), which aborts printing.
    if (errorHappened) {
      obj.abort();
    }
    // Once everything is rendered, tell the backend things are done().
    else if (finishedRendering) {
      obj.done();
    }
    // Otherwise, continue rendering after some timeout.
    else {
      setTimeout(renderFunc, 10);
    }
  })();
};

Text-Rendering-Madness!

Firefox Integration

  • PDF.JS as bundled Addon in Firefox Nightly, Aurora, Beta
  • Getting in Release Channel is hard
    • 450M users have expectations
    • more testing coverage
    • accessibility
    • match UX expectation
    • fallback if something is not working
  • Firefox specific, but improves overall quality of project

Tooling

  • Use Jasmine
  • Use make - but with ShellJS!
  • Use botio
    • Listens for /botio command
    • Linting & Jasmine tests
    • Reference tests
    • Build viewer, addon
    • Updates addon on merge

Example

PDF.JS Pull Request 1892

What's next?

  • Fix broken PDFs
  • Printing support (FF17?)
  • Text search
  • Form support
  • Improve performance
  • Improve text selection

Contributing

  • Translation
  • Testing (extension for Firefox & Chrome)
    • Bugzilla: ONLY Firefox-Integration related bugs
    • GitHub: Everything else
  • Writing Code
    • Make viewer more awesome
    • Make viewer embeddable
    • Look for memory leaks
    • Implement missing parts of the PDF spec
    • Server rendering
    • ...

Getting connected

GitHub: https://github.com/mozilla/pdf.js
Look at readme, wiki
Twitter: @pdfjs
Mailing List: https://groups.google.com/group/...
IRC: irc.mozilla.org #pdfjs
Engineering Weekly Call: Thursday - 10:00am PDT

One more thing...

Thanks!


Questions?