(lispkit draw web)

Library (lispkit draw web) provides web page snapshotting capabilities through a WebKit-based web client. The library enables capturing images and generating PDFs from HTML content, local files, data streams, and remote URLs with configurable viewport settings and cropping options.

This minimal example shows how to snapshot a web page at a given URL and save the snapshot to a JPEG file.

(import (lispkit base)
        (lispkit draw)           ; For using `save-image`
        (lispkit draw web)       ; For web client functionality
        (lispkit thread future)) ; For handling futures
(save-image
  "~/Downloads/objecthub.jpeg"   ; Filename
  (future-get                    ; Wait for snapshot generation to finish
    (web-client-snapshot-url     ; Generate snapshot for given URL
      (make-web-client 1200)     ; Create simple web client with 1200 points width
      "http://objecthub.com"))   ; URL to snapshot
  'jpg)                          ; Save image as JPEG

The following example creates an HTML document containing a report table on the fly and then snapshots the rendered document both as a bitmap and as a PDF.

(import (lispkit base)
        (lispkit draw)
        (lispkit draw web) 
        (lispkit thread future))

(define (generate-web-report data-source path)
  ;; Create web client optimized for documents
  (define client (make-web-client 600 '() #f "Report Generator" 1.5))
  ;; Generate report in HTML form
  (define html
    (string-append
      "<!DOCTYPE html><html><head>"
      "<title>Report</title>"
      "<style>body { font-family: Georgia, Arial; }"
      "h1 { color: #333; padding: 4px; border-bottom: 2px solid #ccc; }"
      "table { border-collapse: collapse; width: 100%; }"
      "th, td { border: 1px solid #ddd; padding: 8px; }</style>"
      "</head><body>"
      "<h1>Data Report</h1>"
      "<table><tr><th>Item</th><th>Value</th></tr>"
      (apply string-append
             (map (lambda (item)
                    (string-append "<tr><td>" (car item) "</td>"
                                 "<td>" (cdr item) "</td></tr>"))
                  data-source))
      "</table></body></html>"))

  ;; Render and snapshot the report both as bitmap and PDF.
  (define img-snapshot
    (web-client-snapshot-html client html 'all))
  (define pdf-snapshot
    (web-client-pdf-snapshot-html client html 'all))
  
  ;; Wait for completion and save the snapshots
  (save-image
    (string-append path ".png")
    (future-get img-snapshot)
    'png)
  (write-binary-file
    (string-append path ".pdf")
    (future-get pdf-snapshot)))

;; Use the report generator
(define sample-data
  '(("Revenue" . "$125,000")
    ("Expenses" . "$89,500")
    ("Profit" . "$35,500")
    ("Growth" . "12.5%")))
(generate-web-report sample-data "~/Downloads/report")

Web clients

(make-web-client width) (make-web-client width scripts) (make-web-client width scripts viewport) (make-web-client width scripts viewport name) (make-web-client width scripts viewport name delay)

Returns a new web client object representing a web view of the given width with specified optional configuration parameters. Web clients are used to load and snapshot web pages.

scripts is a list of JavaScript strings to inject into documents rendered via the web client. Each script can be specified in one of the following forms:

  • "...": The string contains the JavaScript code. It is injected at the end of the document.

  • ("..."): The string wrapped in a pair contains the JavaScript code. It is injected at the end of the document.

  • ("..." start): The string contains the JavaScript code which is injected at the end of the document if start is #f. Otherwise, the JavaScript code is injected at the beginning of the document.

  • ("..." start main): The string contains the JavaScript code which is injected at the end of the document if start is #f. Otherwise, the JavaScript code is injected at the beginning of the document. Boolean main specifies if the code is injected only into the main frame or all frames.

Argument viewport specifies the view port of rendered documents. Supported are the following values:

  • (): No view port is being defined explicitly.

  • #t: The view port is constrained by the width of the client with a transparent background.

  • #f: The view port is constrained by the width of the client, setting the following other parameters: initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0, user-scalable=no. This is the default.

  • "...": The view port is defined via the following meta tag: <meta name="viewport" content="...">

Argument name defines the name of the application that appears in the user agent string. delay specifies the delay in seconds before taking snapshots to allow dynamic content loading (default: 0.0).
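
As a rough sketch of how these parameters combine, the following creates a client with all optional arguments supplied. The JavaScript snippets, the viewport content string, and the application name are illustrative assumptions, not values required by the library.

```scheme
(import (lispkit base)
        (lispkit draw web))

;; Script list mixing the documented forms. The JavaScript code itself
;; is purely illustrative.
(define scripts
  (list
    ;; Plain string: injected at the end of the document
    "document.body.classList.add('snapshot');"
    ;; (code start) with start = #t: injected at the beginning
    (list "window.snapshotMode = true;" #t)
    ;; (code start main) with start = #f: injected at the end; see the
    ;; description of main above for its effect on frames
    (list "console.log('rendered');" #f #t)))

(define client
  (make-web-client
    800                                       ; width in points
    scripts                                   ; scripts to inject
    "width=device-width, initial-scale=1.0"   ; viewport meta tag content
    "Snapshot Example"                        ; name used in the user agent
    0.5))                                     ; delay in seconds before snapshots

(display (web-client? client))
(newline)
```

A non-zero delay is mainly useful when the injected scripts or the page itself modify the document after loading.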

(web-client? obj)

Returns #t if obj is a web client object, #f otherwise.

(web-client-busy? web-client)

Returns #t if the web client is currently processing a request, #f if it's available for new snapshotting requests. Web clients process requests sequentially to avoid conflicts.
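
A small sketch of how web-client-busy? might be used around a snapshot request. The HTML string is a placeholder. Whether the first check reports #t depends on timing, so no results are shown.

```scheme
(import (lispkit base)
        (lispkit draw web)
        (lispkit thread future))

(define client (make-web-client 800))

;; Start a snapshot request; the procedure returns a future for the image
(define snapshot
  (web-client-snapshot-html
    client
    "<!DOCTYPE html><html><body><p>Hello</p></body></html>"
    'all))

;; The client may still be processing the request at this point
(display (web-client-busy? client))
(newline)

;; Block until the image is available
(define image (future-get snapshot))

;; Once the request has completed, the client should be available again
(display (web-client-busy? client))
(newline)
```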

Crop Modes

The snapshot procedures support six crop modes for controlling which portion of the web page is captured. The rect mode can be written in two equivalent forms, both of which are listed below:

  • all: Capture the entire web view bounds, including any empty space.

  • trim: Automatically detect and capture only the content area, trimming empty margins.

  • (inset top right bottom left): Crop by insetting the specified amounts from the web view edges.

  • (inset-trimmed top right bottom left): Crop by insetting the specified amounts from the automatically detected content area.

  • (rect x y width height): Capture a specific rectangular region (standard representation of rectangles).

  • ((x . y) . (width . height)): Alternative rectangle form.

  • (rect-trimmed x y width height): Rectangle relative to the trimmed content area.
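
The crop modes listed above could be exercised as in the following sketch. It assumes that crop modes are passed as quoted symbols and quoted lists, in line with the 'all argument used in the report example. The HTML string, the helper snapshot-with, and all numeric values are hypothetical.

```scheme
(import (lispkit base)
        (lispkit draw web)
        (lispkit thread future))

(define client (make-web-client 600))

(define html
  (string-append
    "<!DOCTYPE html><html><body>"
    "<h1>Crop modes</h1><p>Sample content</p>"
    "</body></html>"))

;; Hypothetical helper: snapshot the same document with a given crop
;; mode and wait for the resulting image
(define (snapshot-with crop-mode)
  (future-get (web-client-snapshot-html client html crop-mode)))

;; Entire web view, including empty space
(define full (snapshot-with 'all))

;; Content area only
(define trimmed (snapshot-with 'trim))

;; Web view inset by 10 points on every edge (top right bottom left)
(define inset (snapshot-with '(inset 10 10 10 10)))

;; Content area inset by 5 points on every edge
(define inset-trimmed (snapshot-with '(inset-trimmed 5 5 5 5)))

;; A 300 x 100 region at the origin, in both rectangle notations
(define region (snapshot-with '(rect 0 0 300 100)))
(define region-alt (snapshot-with '((0 . 0) . (300 . 100))))

;; The same region size, positioned relative to the trimmed content area
(define region-trimmed (snapshot-with '(rect-trimmed 0 0 300 100)))
```

Because each call waits for its future before the next request is issued, the client handles one request at a time.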

Image snapshots

(web-client-snapshot-html client html crop-mode) (web-client-snapshot-html client html crop-mode width) (web-client-snapshot-html client html url crop-mode) (web-client-snapshot-html client html url crop-mode width)

Captures an image snapshot of HTML content provided by the string html to the web client client and returns a future referring to the captured image. url is the base URL used for resolving relative resources. It is optional and can be set to #f. crop-mode defines what portion of the web page is being captured. It is either a rectangular object or has one of the following forms: all, trim, (inset top right bottom left), (inset-trimmed top right bottom left), (rect x y width height), or (rect-trimmed x y width height). For details, see the section on Crop Modes. width defines a width for the rendered document and overrides, if provided, the width defined by web client client; default is #f.

(web-client-snapshot-data client data mime) (web-client-snapshot-data client data mime encoding) (web-client-snapshot-data client data mime encoding url) (web-client-snapshot-data client data mime encoding url crop-mode) (web-client-snapshot-data client data mime encoding url crop-mode width)

Captures an image snapshot of content provided by the bytevector data of mime type mime to the web client client and returns a future referring to the captured image. encoding is a string specifying the encoding of the data; default is "UTF-8". url is the base URL used for resolving relative resources; default is "http://localhost". crop-mode defines what portion of the document is being captured. It is either a rectangular object or has one of the following forms: all, trim, (inset top right bottom left), (inset-trimmed top right bottom left), (rect x y width height), or (rect-trimmed x y width height); default is all. For details, see the section on Crop Modes. width defines a width for the rendered document and overrides, if provided, the width defined by web client client; default is #f.
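
A minimal sketch of snapshotting in-memory data, assuming that string->utf8 is available via (lispkit base) and that the crop mode is passed as a quoted symbol. The HTML content and the output path are placeholders.

```scheme
(import (lispkit base)
        (lispkit draw)
        (lispkit draw web)
        (lispkit thread future))

(define client (make-web-client 600))

;; Encode an HTML string as a UTF-8 bytevector
(define data
  (string->utf8
    "<!DOCTYPE html><html><body><p>Hello from a bytevector</p></body></html>"))

;; Snapshot the data as HTML, trimming empty margins
(define snapshot
  (web-client-snapshot-data
    client
    data
    "text/html"          ; mime type of the data
    "UTF-8"              ; encoding (the documented default)
    "http://localhost"   ; base URL (the documented default)
    'trim))

(save-image "~/Downloads/data-snapshot.png" (future-get snapshot) 'png)
```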

(web-client-snapshot-file client path dir) (web-client-snapshot-file client path dir crop-mode) (web-client-snapshot-file client path dir crop-mode width)

Captures an image snapshot of HTML content provided by the file at path to the web client client and returns a future referring to the captured image. dir is the base directory used for resolving relative file resources. crop-mode defines what portion of the web page is being captured. It is either a rectangular object or has one of the following forms: all, trim, (inset top right bottom left), (inset-trimmed top right bottom left), (rect x y width height), or (rect-trimmed x y width height); default is all. For details, see the section on Crop Modes. width defines a width for the rendered document and overrides, if provided, the width defined by web client client; default is #f.
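
The following sketch shows one way a local HTML file might be snapshotted. Both file system paths are hypothetical and need to be replaced with paths that exist on the local machine. The directory is passed as dir so that relative references in the file (images, style sheets) can be resolved.

```scheme
(import (lispkit base)
        (lispkit draw)
        (lispkit draw web)
        (lispkit thread future))

(define client (make-web-client 1000))

;; Hypothetical location of a local web page and its resources
(define site-dir "/Users/me/site")
(define page-path "/Users/me/site/index.html")

;; Snapshot the file, capturing only the content area
(define snapshot
  (web-client-snapshot-file client page-path site-dir 'trim))

(save-image "~/Downloads/index.png" (future-get snapshot) 'png)
```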

(web-client-snapshot-url client req) (web-client-snapshot-url client req crop-mode) (web-client-snapshot-url client req crop-mode width)

Captures an image snapshot of HTML content requested via req using the web client client and returns a future referring to the captured image. req is either a URL or an HTTP request created with library (lispkit http). crop-mode defines what portion of the web page is being captured. It is either a rectangular object or has one of the following forms: all, trim, (inset top right bottom left), (inset-trimmed top right bottom left), (rect x y width height), or (rect-trimmed x y width height); default is all. For details, see the section on Crop Modes. width defines a width for the rendered document and overrides, if provided, the width defined by web client client; default is #f.
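
As a variation of the introductory example, this sketch passes the optional crop-mode and width arguments. The width of 800 and the output file name are arbitrary choices for illustration.

```scheme
(import (lispkit base)
        (lispkit draw)
        (lispkit draw web)
        (lispkit thread future))

;; The client is 1200 points wide by default
(define client (make-web-client 1200))

;; Render this one page at a width of 800 points instead and trim
;; empty margins
(define snapshot
  (web-client-snapshot-url client "http://objecthub.com" 'trim 800))

(save-image "~/Downloads/objecthub-narrow.png" (future-get snapshot) 'png)
```

Overriding the width per request allows a single client to be reused for pages that need different layouts.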

PDF snapshots

(web-client-pdf-snapshot-html client html) (web-client-pdf-snapshot-html client html crop-mode) (web-client-pdf-snapshot-html client html url) (web-client-pdf-snapshot-html client html url crop-mode)

Captures a snapshot of HTML content provided by the string html to the web client client and returns a future referring to a PDF document containing the captured snapshot serialized into a bytevector. url is the base URL used for resolving relative resources. It is optional and can be set to #f. crop-mode defines what portion of the web page is being captured. It is either a rectangular object or has one of the following forms: all, trim, (inset top right bottom left), (inset-trimmed top right bottom left), (rect x y width height), or (rect-trimmed x y width height). For details, see the section on Crop Modes.

(web-client-pdf-snapshot-data client data mime) (web-client-pdf-snapshot-data client data mime encoding) (web-client-pdf-snapshot-data client data mime encoding url) (web-client-pdf-snapshot-data client data mime encoding url crop-mode)

Captures a snapshot of content provided by the bytevector data of mime type mime to the web client client and returns a future referring to a PDF document containing the captured snapshot serialized into a bytevector. encoding is a string specifying the encoding of the data; default is "UTF-8". url is the base URL used for resolving relative resources; default is "http://localhost". crop-mode defines what portion of the document is being captured. It is either a rectangular object or has one of the following forms: all, trim, (inset top right bottom left), (inset-trimmed top right bottom left), (rect x y width height), or (rect-trimmed x y width height); default is all. For details, see the section on Crop Modes.

(web-client-pdf-snapshot-file client path dir) (web-client-pdf-snapshot-file client path dir crop-mode)

Captures a snapshot of HTML content provided by the file at path to the web client client and returns a future referring to a PDF document containing the captured snapshot serialized into a bytevector. dir is the base directory used for resolving relative file resources. crop-mode defines what portion of the web page is being captured. It is either a rectangular object or has one of the following forms: all, trim, (inset top right bottom left), (inset-trimmed top right bottom left), (rect x y width height), or (rect-trimmed x y width height); default is all. For details, see the section on Crop Modes.

(web-client-pdf-snapshot-url client req) (web-client-pdf-snapshot-url client req crop-mode)

Captures a snapshot of HTML content requested via req using the web client client and returns a future referring to a PDF document containing the captured snapshot serialized into a bytevector. req is either a URL or an HTTP request created with library (lispkit http). crop-mode defines what portion of the web page is being captured. It is either a rectangular object or has one of the following forms: all, trim, (inset top right bottom left), (inset-trimmed top right bottom left), (rect x y width height), or (rect-trimmed x y width height); default is all. For details, see the section on Crop Modes.
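
A short sketch of saving a remote page as a PDF file. It follows the report example at the top of this page in using write-binary-file to store the resulting bytevector. The output path is a placeholder.

```scheme
(import (lispkit base)
        (lispkit draw web)
        (lispkit thread future))

(define client (make-web-client 1200))

;; Request a PDF snapshot of the whole page; the future yields a bytevector
(define pdf
  (web-client-pdf-snapshot-url client "http://objecthub.com" 'all))

;; Wait for the PDF data and write it to disk
(write-binary-file "~/Downloads/objecthub.pdf" (future-get pdf))
```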
