Paul Joseph Davis

CouchDB View Index Protocol

Note:

This was part of an experiment I tried a while back. There hasn't been any interest in including it in trunk, and I don't really think it should be included. It is up here for reference, but you probably shouldn't think about using it.

Outline

A quick document outlining the line protocol for the couchdb-view-index patch I wrote yesterday.

There are four types of interactions:

  1. Reset - Sent when the external process has been grabbed for new processing
  2. Index - Sent with view rows to add and delete from the index
  3. Delete - Sent when the view has been reset and things need to start over
  4. Query - Sent with URL query string parameters to query the view
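
Since every message carries an "action" field, an indexer can route requests with a simple lookup. The following is a minimal Python sketch, not part of the patch: the dispatch function and the handlers table are hypothetical names, and the handlers do nothing except return the true response. Query is left out of the table because its response is not a single line.

```python
import json

def dispatch(line, handlers):
    """Decode one request line and route it by its "action" field."""
    message = json.loads(line)
    action = message["action"]
    if action not in handlers:
        raise ValueError("unknown action: %r" % (action,))
    return handlers[action](message)

# Hypothetical placeholder handlers; a real indexer would do work here.
handlers = {
    "reset": lambda message: True,
    "index": lambda message: True,
    "delete": lambda message: True,
}
```

Each handler's return value would then be serialized and written back as the single-line response.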

Note

This is a line protocol. I have formatted everything here for readability, but in real life all messages and responses (except the query response) must be a single line terminated by a newline character. The query response is discussed below.
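
As an illustration of the single-line rule, here is a small Python sketch. The encode_line and decode_line names are made up for this example. It relies on the fact that json.dumps without an indent argument never emits a raw newline, because newlines inside strings are escaped.

```python
import json

def encode_line(message):
    """Serialize a message as one line of JSON ending in a newline."""
    # No indent argument, so the JSON text itself contains no raw newlines.
    return json.dumps(message, separators=(",", ":")) + "\n"

def decode_line(line):
    """Parse one received line back into a Python value."""
    return json.loads(line)
```

For example, encode_line({"action": "reset"}) produces the reset message shown below as a single newline-terminated line.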

Reset

Message:

{"action": "reset"}

Response:

true

Index

Message:

{
    "action": "index",
    "db": "db_name",
    "group": "_design/foo",
    "views": ["bar", "baz"],
    "current_seq": 508,
    "new_seq": 602,
    "insert": [
        {"docid": "a", "key": 1, "value": "data here"},
        {"docid": "b", "key": 2, "value": null}
    ],
    "remove": [{"docid": "c", "key": 3}]
}

When implementing an indexer you should process the "remove" rows before the "insert" rows to match the semantics of CouchDB's internal system.

Response:

true
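
The remove-before-insert ordering can be sketched in Python as follows. This is only an illustration under loud assumptions: the index is a plain in-memory dict, rows are identified by docid plus the JSON-encoded key, and apply_index is a hypothetical name. A real indexer would write to whatever storage it uses.

```python
import json

def row_id(row):
    """Build a hashable identity for a row (assumed: docid plus key)."""
    # Keys may be arbitrary JSON, so encode them to get something hashable.
    return (row["docid"], json.dumps(row["key"], sort_keys=True))

def apply_index(index, message):
    """Apply one "index" message to an in-memory dict standing in for an index."""
    # Removes go first, so a row that is removed and re-inserted in the
    # same batch ends up present with its new value.
    for row in message.get("remove", []):
        index.pop(row_id(row), None)
    for row in message.get("insert", []):
        index[row_id(row)] = row["value"]
    return True
```

Doing it in the opposite order would drop any row that appears in both lists.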

Delete

Message:

{
    "action": "delete",
    "db": "foo",
    "group": "_design/bar",
    "current_seq": 6
}

Response:

true

Query

Message:

Given a URL like:

curl 'http://127.0.0.1:5984/zing/_index/idx_type/foo/bar?q="my query"&count=19&skip=10'

Anything passed as a URL parameter will be sent to the index process. You must ensure that every query string parameter value is valid JSON that can be decoded by CouchDB's JSON module (note that the string value of q above is quoted). The resulting message:

{
    "action": "query",
    "db": "zing",
    "group": "_design/foo",
    "view": "bar",
    "query": {
        "q": "my query",
        "count": 19,
        "skip": 10
    }
}
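
To make the mapping from URL to message concrete, here is a rough Python sketch. CouchDB does this step itself, so the code only illustrates the idea. The build_query_message name is hypothetical, and the path layout of db, _index, index type, group, view is inferred from the single example URL above.

```python
import json
from urllib.parse import urlsplit, parse_qsl

def build_query_message(url):
    """Turn an index query URL into the message shape shown above."""
    parts = urlsplit(url)
    # Assumed path layout: /<db>/_index/<idx_type>/<group>/<view>
    db, _index, _idx_type, group, view = parts.path.strip("/").split("/")
    query = {}
    for name, raw in parse_qsl(parts.query, keep_blank_values=True):
        # Every value must be valid JSON; json.loads raises ValueError otherwise.
        query[name] = json.loads(raw)
    return {
        "action": "query",
        "db": db,
        "group": "_design/" + group,
        "view": view,
        "query": query,
    }
```

With this sketch an unquoted value such as q=my query is rejected, because it is not valid JSON.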

Response:

This is where things get complicated. The query response comes in three stages:

  1. Initialization:
    • All is well:
      true
    • Or not well:
      [404, "missing", "not_found"]
  2. Streaming:
     {"total_rows": 10, "offset": 8, "rows": [
     {"id": "h_docid", "key": 8, "value": "foo"},
     {"id": "i_docid", "key": 9, "value": "bar"},
     {"id": "j_docid", "key": 10, "value": "baz"}
     ]}
    
    You can stream any free-form JSON that you want to end up at the client. You should probably support at least the expected view output.
  3. Termination:
    \n\0\n
    Once you've sent the termination sequence (a newline, a NUL byte, and another newline), you should not attempt to write anything to stdout until you get the next request.
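
The three stages of a successful query response can be sketched as a Python generator. This is a sketch under stated assumptions, not the patch's reference implementation: query_response_chunks is a hypothetical name, the initialization line is assumed to be newline-terminated like other responses, and the error case is omitted because the text above does not say what follows the error line.

```python
import json

TERMINATOR = "\n\0\n"  # newline, NUL byte, newline

def query_response_chunks(total_rows, offset, rows):
    """Yield the pieces of a successful query response in order."""
    # Stage 1: initialization (assumed to be newline-terminated).
    yield "true\n"
    # Stage 2: stream the view-style JSON body, one row per chunk.
    yield '{"total_rows": %d, "offset": %d, "rows": [\n' % (total_rows, offset)
    first = True
    for row in rows:
        if not first:
            yield ",\n"
        yield json.dumps(row)
        first = False
    yield "\n]}"
    # Stage 3: termination; nothing may follow until the next request.
    yield TERMINATOR
```

A caller would write each chunk to stdout and flush after the terminator. Because JSON escapes control characters inside strings, the streamed body cannot contain the termination sequence by accident.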