# (lispkit char-set)

Library `(lispkit char-set)` implements efficient means to represent and manipulate sets of characters. Its design is based on SRFI 14 but the implementation is specific to the definition of characters in LispKit; i.e. library `(lispkit char-set)` assumes that characters are UTF-16 code units.

As opposed to SRFI 14, it can be assumed that the update procedures ending with "!" are mutating the corresponding character set. This means that clients of these procedures may rely on these procedures performing their functionality in terms of side effects.

In the procedure specifications below, the following conventions are used:

* A *cs* parameter is a character set.
* A *s* parameter is a string.
* A *char* parameter is a character.
* A *char-list* parameter is a list of characters.
* A *pred* parameter is a unary character predicate procedure, returning either `#t` or `#f` when applied to a character.
* An *obj* parameter may be any value at all.

Passing values to procedures with these parameters that do not satisfy these types is an error.

Unless otherwise noted in the specification of a procedure, procedures always return character sets that are distinct from the parameter character sets (unless the procedure mutates a character set and its name ends with "!"). For example, `char-set-adjoin` is guaranteed to provide a fresh character set, even if it is not given any character parameters.

Library `(lispkit char-set)` supports both mutable as well as immutable character sets. Character sets are assumed to be mutable unless they are explicitly specified to be immutable.

## Constants

**char-set:lower-case** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-bdc0997c38ced7c944ea089918006133f1a4052f%2Fconst.png?alt=media" alt="" data-size="line">\
**char-set:upper-case**\
**char-set:title-case**\
**char-set:letter**\
**char-set:digit**\
**char-set:letter+digit**\
**char-set:graphic**\
**char-set:printing**\
**char-set:whitespace**\
**char-set:newlines**\
**char-set:iso-control**\
**char-set:punctuation**\
**char-set:symbol**\
**char-set:hex-digit**\
**char-set:blank**\
**char-set:ascii**\
**char-set:empty**\
**char-set:full**

Library `(lispkit char-set)` predefines these frequently used immutable character sets.

Note that there may be characters in `char-set:letter` that are neither upper or lower case. The `char-set:whitespaces` character set contains whitespace and newline characters. `char-set:blanks` only contains whitespace (i.e. "blank") characters. `char-set:newlines` only contains newline characters.

**char-set-type-tag** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-28888abe04e78015f951d1fed77a8116c810cfdc%2Fobj.png?alt=media" alt="" data-size="line">

Symbol representing the `char-set` type. The `type-for` procedure of library `(lispkit type)` returns this symbol for all character set objects.

## Predicates

**(char-set?&#x20;*****obj*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Returns `#t` if *obj* is a character set, otherwise returns `#f`.

**(char-set-empty?&#x20;*****cs*****)**   <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Returns `#t` if the character set *cs* does not contain any characters, otherwise returns `#f`.

**(char-set=?&#x20;*****cs ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Returns `#t` if all the provided character sets *cs ...* contain the exact same characters; returns `#f` otherwise. For both corner cases, `(char-set=?)` and `(char-set=? cs)`, `char-set=?` returns `#t`.

**(char-set<=?&#x20;*****cs ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Returns `#t` if every character set *cs-i* is a subset of character set *cs-i+1*; returns `#f` otherwise. For both corner cases, `(char-set<=?)` and `(char-set<=? cs)`, `char-set<=?` returns `#t`.

**(char-set-disjoint?&#x20;*****cs1 cs2*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Returns `#t` if character sets *cs1* and *cs2* are disjoint, i.e. they do not share a single character; returns `#f` otherwise.

**(char-set-contains?&#x20;*****cs char*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Returns `#t` if character *char* is contained in character set *cs*; returns `#f` otherwise.

**(char-set-every?&#x20;*****pred cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(char-set-any?&#x20;*****pred cs*****)**

The `char-set-every?` procedure returns `#t` if predicate *pred* returns `#t` for every character in the character set *cs*. Likewise, `char-set-any?` applies *pred* to every character in character set *cs*, and returns `#t` if there is at least one character for which *pred* returns `#t`. If no character produces a `#t` value, it returns `#f`. The order in which these procedures sequence through the elements of *cs* is not specified.

## Constructors

**(char-set&#x20;*****char ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Return a newly allocated mutable character set containing the given characters.

**(immutable-char-set&#x20;*****char ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Return a newly allocated immutable character set containing the given characters.

**(char-set-copy&#x20;*****cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(char-set-copy&#x20;*****cs mutable?*****)**

Returns a newly allocated copy of the character set *cs*. The copy is mutable by default unless parameter *mutable?* is provided and set to `#f`.

**(list->char-set&#x20;*****char-list*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(list->char-set&#x20;*****char-list base-cs*****)**

Return a newly allocated mutable character set containing the characters in the list of characters *char-list*. If character set *base-cs* is provided, the characters from *base-cs* are added to it as well.

**(string->char-set&#x20;*****s*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(string->char-set&#x20;*****s base-cs*****)**

Return a newly allocated mutable character set containing the characters of the string *s*. If character set *base-cs* is provided, the characters from *base-cs* are added to it as well.

**(ucs-range->char-set&#x20;*****lower upper*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(ucs-range->char-set&#x20;*****lower upper base-cs*****)**\
\&#xNAN;**(ucs-range->char-set&#x20;*****lower upper limit base-cs*****)**

Returns a newly allocated mutable character set containing every character whose ISO/IEC 10646 UCS-4 code lies in the half-open range *\[lower,upper)*. *lower* and *upper* are exact non-negative integers where *lower <= upper <= limit* is required to hold. *limit* is either an exact non-negative integer specifying the maximum upper limit, or it is `#t` which specifies the maximum UTF-16 code unit value. If *limit* is not provided, a very large default is assumed (equivalent to *limit* being `#f`).

This signature is compatible with the SRFI 16 specification which states that if the requested range includes unassigned UCS values, these are silently ignored. If the requested range includes "private" or "user space" codes, these are passed through transparently. If any code from the requested range specifies a valid, assigned UCS character that has no corresponding representative in the implementation's character type, then

1. an error is raised if *limit* is `#t`, and
2. the code is ignored if *limit* is `#f` (the default).

If character set *base-cs* is provided, the characters of *base-cs* are included in the newly allocated mutable character set.

**(char-set-filter&#x20;*****pred cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(char-set-filter&#x20;*****pred cs base-cs*****)**

Returns a new character set containing every character *c* in character set *cs* such that *(pred c)* returns true. If character set *base-cs* is provided, the characters specified by *base-cs* are added to it.

**(->char-set&#x20;*****x*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Coerces object *x* into a character set. *x* may be a string, character or character set. A string is converted to the set of its constituent characters; a character is converted to a singleton character set; a character set is returned as is. This procedure is intended for use by other procedures that want to provide "user-friendly", wide-spectrum interfaces to their clients.

## Querying character sets

**(char-set-size&#x20;*****cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Returns the number of elements in character set *cs*.

**(char-set-count&#x20;*****pred cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Apply *pred* to the characters of character set *cs*, and return the number of characters that caused the predicate to return `#t`.

**(char-set->list&#x20;*****cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

This procedure returns a list of the characters of character set *cs*. The order in which *cs*'s characters appear in the list is not defined, and may be different from one call to another.

**(char-set->string&#x20;*****cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

This procedure returns a string containing the characters of character set *cs*. The order in which *cs*'s characters appear in the string is not defined, and may be different from one call to another.

**(char-set-hash&#x20;*****cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(char-set-hash&#x20;*****cs bound*****)**

Compute a hash value for the character set *cs*. *bound* is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range *\[0, bound)*. If *bound* is either zero or not given, a default value is used, chosen to be as large as it is efficiently practical.

## Character set algebra

**(char-set-adjoin&#x20;*****cs char ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Return a newly allocated mutable copy of *cs* into which the characters *char ...* were inserted.

**(char-set-delete&#x20;*****cs char ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Return a newly allocated mutable copy of *cs* from which the characters *char ...* were removed.

**(char-set-complement&#x20;*****cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Return a newly allocated character set containing all characters that are not contained in *cs*.

**(char-set-union&#x20;*****cs ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(char-set-intersection&#x20;*****cs ...*****)**\
\&#xNAN;**(char-set-difference&#x20;*****cs ...*****)**\
\&#xNAN;**(char-set-xor&#x20;*****cs ...*****)**\
\&#xNAN;**(char-set-diff+intersection&#x20;*****cs1 cs2 ...*****)**

These procedures implement set complement, union, intersection, difference, and exclusive-or for character sets. The union, intersection and xor operations are n-ary. The difference function is also n-ary, associates to the left (that is, it computes the difference between its first argument and the union of all the other arguments), and requires at least one argument.

Boundary cases:

```scheme
(char-set-union)          ⇒  char-set:empty
(char-set-intersection)   ⇒  char-set:full
(char-set-xor)            ⇒  char-set:empty
(char-set-difference cs)  ⇒  cs
```

`char-set-diff+intersection` returns both the difference and the intersection of the arguments, i.e. it partitions its first parameter. It is equivalent to `(values (char-set-difference cs1 cs2 ...) (char-set-intersection cs1 (char-set-union cs2 ...)))` but can be implemented more efficiently.

## Mutating character sets

**(char-set-adjoin!&#x20;*****cs char ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Insert the characters *char ...* into the character set *cs*.

**(char-set-delete!&#x20;*****cs char ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Remove the characters *char ...* from the character set *cs*.

**(char-set-complement!&#x20;*****cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Complement the character set *cs* by including all characters that were not contained in *cs* previously and by removing all previously contained characters.

**(char-set-union!&#x20;*****cs1 cs2 ...*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(char-set-intersection!&#x20;*****cs1 cs2 ...*****)**\
\&#xNAN;**(char-set-difference!&#x20;*****cs1 cs2 ...*****)**\
\&#xNAN;**(char-set-xor!&#x20;*****cs1 cs2 ...*****)**\
\&#xNAN;**(char-set-diff+intersection!&#x20;*****cs1 cs2 cs3 ...*****)**

These are update variants of the set-algebra functions mutating the first character set *cs1* instead of creating a new one. `char-set-diff+intersection!` will perform a side-effect on both of its two required parameters *cs1* and *cs2*.

**(char-set-filter!&#x20;*****pred cs base-cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Adds every character *c* in *cs* for which *(pred c)* returns `#t` to the given character set `base-cs`.

**(list->char-set!&#x20;*****char-list base-cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Add the characters from the character list *char-list* to character set *base-cs* and return the mutated character set *base-cs*.

**(string->char-set!&#x20;*****s base-cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Add the characters of the string *s* to character set *base-cs* and return the mutated character set *base-cs*.

**(ucs-range->char-set!&#x20;*****lower upper base-cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(ucs-range->char-set!&#x20;*****lower upper limit base-cs*****)**

Mutates the mutable character set *base-cs* including every character whose ISO/IEC 10646 UCS-4 code lies in the half-open range *\[lower,upper)*. *lower* and *upper* are exact non-negative integers where *lower <= upper <= limit* is required to hold. *limit* is either an exact non-negative integer specifying the maximum upper limit, or it is `#t` which specifies the maximum UTF-16 code unit value. If *limit* is not provided, a very large default is assumed (equivalent to *limit* being `#f`).

**(char-set-unfold!&#x20;*****f p g seed base-cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

This is a fundamental constructor for character sets.

* *g* is used to generate a series of "seed" values from the initial seed: seed, (g seed), (g2 seed), (g3 seed), ...
* *p* tells us when to stop by returning `#t` when applied to one of these seed values.
* *f* maps each seed value to a character. These characters are added to the base character set *base-cs* to form the result. `char-set-unfold!` adds the characters by mutating *base-cs* as a side effect.

## Iterating over character sets

**(char-set-cursor&#x20;*****cs*****)**   <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(char-set-ref&#x20;*****cs cursor*****)**\
\&#xNAN;**(char-set-cursor-next&#x20;*****cs cursor*****)**\
\&#xNAN;**(end-of-char-set?&#x20;*****cursor*****)**

Cursors are a low-level facility for iterating over the characters in a character set *cs*. A cursor is a value that indexes a character in a character set. `char-set-cursor` produces a new cursor for a given character set. The set element indexed by the cursor is fetched with `char-set-ref`. A cursor index is incremented with `char-set-cursor-next`; in this way, code can step through every character in a character set. Stepping a cursor "past the end" of a character set produces a cursor that answers true to `end-of-char-set?`. It is an error to pass such a cursor to `char-set-ref` or to `char-set-cursor-next`.

A cursor value may not be used in conjunction with a different character set; if it is passed to `char-set-ref` or `char-set-cursor-next` with a character set other than the one used to create it, the results and effects are undefined. These primitives are necessary to export an iteration facility for character sets to loop macros.

```scheme
(define cs (char-set #\G #\a #\T #\e #\c #\h))

;; Collect elts of CS into a list.
(let lp ((cur (char-set-cursor cs)) (ans '()))
  (if (end-of-char-set? cur) ans
      (lp (char-set-cursor-next cs cur)
          (cons (char-set-ref cs cur) ans))))
  ⇒  (#\G #\T #\a #\c #\e #\h)

;; Equivalently, using a list unfold (from SRFI 1):
(unfold-right end-of-char-set? 
              (curry char-set-ref cs)
	          (curry char-set-cursor-next cs)
	          (char-set-cursor cs))
  ⇒  (#\G #\T #\a #\c #\e #\h)
```

**(char-set-fold&#x20;*****kons knil cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

This is the fundamental iterator for character sets. Applies the function *kons* across the character set *cs* using initial state value *knil*. That is, if *cs* is the empty set, the procedure returns *knil*. Otherwise, some element *c* of *cs* is chosen; let *cs'* be the remaining, unchosen characters. The procedure returns `(char-set-fold kons (kons c knil) cs')`.

```scheme
; CHAR-SET-MEMBERS
(lambda (cs) (char-set-fold cons '() cs))
; CHAR-SET-SIZE
(lambda (cs) (char-set-fold (lambda (c i) (+ i 1)) 0 cs))
; How many vowels in the char set?
(lambda (cs) 
  (char-set-fold (lambda (c i) (if (vowel? c) (+ i 1) i)) 0 cs))
```

**(char-set-unfold&#x20;*****f p g seed*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">\
\&#xNAN;**(char-set-unfold&#x20;*****f p g seed base-cs*****)**

This is a fundamental constructor for character sets.

* *g* is used to generate a series of "seed" values from the initial seed: seed, (g seed), (g2 seed), (g3 seed), ...
* *p* tells us when to stop, when it returns `#t` when applied to one of these seed values.
* *f* maps each seed value to a character. These characters are added to a mutable copy of the base character set *base-cs* to form the result; *base-cs* defaults to an empty set.

More precisely, the following definitions hold, ignoring the optional-argument issues:

```scheme
(define (char-set-unfold p f g seed base-cs) 
  (char-set-unfold! p f g seed (char-set-copy base-cs)))

(define (char-set-unfold! p f g seed base-cs)
  (let lp ((seed seed) (cs base-cs))
    (if (p seed) cs                            ; P says we are done
        (lp (g seed)                           ; Loop on (G SEED)
            (char-set-adjoin! cs (f seed)))))) ; Add (F SEED) to set
```

Examples:

```scheme
(port->char-set p) = (char-set-unfold eof-object?
                                      values
                                      (lambda (x) (read-char p))
                                      (read-char p))
(list->char-set lis) = (char-set-unfold null? car cdr lis)
```

**(char-set-for-each&#x20;*****proc cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

Apply procedure *proc* to each character in the character set *cs*. Note that the order in which proc is applied to the characters in the set is not specified, and may even change from one procedure application to another.

**(char-set-map&#x20;*****proc cs*****)** <img src="https://1467949168-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fna2foeoaXHYkSD3fhs0t%2Fuploads%2Fgit-blob-d20368c588cfbb523beb2fae4f8be0f8ef011884%2Fproc.png?alt=media" alt="" data-size="line">

*proc* is a procedure mapping characters to characters. It will be applied to all the characters in the character set *cs*, and the results will be collected in a newly allocated mutable character set which will be returned by `char-set-map`.
