Filters need more specification
Given that all data in Elm is immutable, it is VERY important that we shape our data model the right way.
However, the Matrix spec doesn't seem sufficiently clear on how certain endpoints interact with filters, and this ambiguity may lead us to misrepresent the timeline in our data model.
I have no familiarity with the Server-Server API and I'm basing my perspective of the timeline purely on the linear timeline as presented in the Client-Server API. Section 7.6 Syncing suggests that batch tokens can be seen as marked waypoints between two events, and that the timeline can be seen as one with a strict total ordering.
Filters and endpoints
Currently, three endpoints support filters:
/sync
The /sync endpoint gets you the latest events in the timeline, as long as they match the criteria of the filter. From my understanding, the endpoint is defined as follows:
As you can see:
- With no filter, the endpoint is clear.
- With a filter, the endpoint is clear if the most recent event on the timeline meets the filter's criteria.
- With a filter, the endpoint is NOT clear if the most recent event doesn't meet the filter's criteria.
One could argue that the next_batch token should be set at the end of the timeline, but it could also make sense to return the next_batch token at the most recent event that matches the filter.
The spec doesn't seem to prescribe either.
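To make the two options concrete, here is a minimal Python sketch of a toy timeline. It is not taken from the spec or from any homeserver: the sync function, the token_at_end flag and the integer tokens are all hypothetical, and real batch tokens are opaque strings.

```python
from typing import Callable, List, Tuple

Event = str  # hypothetical: an event is represented by its shape only


def sync(timeline: List[Event], matches: Callable[[Event], bool],
         token_at_end: bool) -> Tuple[List[Event], int]:
    """Toy /sync: return the matching events and a next_batch token.

    Tokens are modelled as integer gaps: token i sits just before timeline[i].
    """
    events = [e for e in timeline if matches(e)]
    if token_at_end:
        # Option 1: the token points at the very end of the timeline.
        next_batch = len(timeline)
    else:
        # Option 2: the token sits right after the most recent matching event.
        last = max((i for i, e in enumerate(timeline) if matches(e)), default=-1)
        next_batch = last + 1
    return events, next_batch


def only_circles(event: Event) -> bool:
    return event == "circle"


timeline = ["circle", "square", "circle", "star"]
events_a, token_a = sync(timeline, only_circles, token_at_end=True)
events_b, token_b = sync(timeline, only_circles, token_at_end=False)
```

Both calls return the same two circle events, but the tokens differ: in this toy model option 1 lands after the star, while option 2 lands right after the last circle. A client that stores the token as "the end of what I know" ends up with a different picture of the timeline depending on which option the homeserver picked.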
/messages
The /messages endpoint is a little trickier, and some of the inputs aren't exactly clear. What should happen when the user inserts invalid input?
When I asked people in the Matrix spec channel, opinions were split:
- Some have argued that the endpoint should return no events, as the homeserver should stop iterating once it has passed the to token.
- Some have implied that the endpoint should iterate until it has reached any of the limits, as the batch tokens are opaque and homeservers shouldn't be expected to know the relative position of two tokens.
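The two readings can be sketched side by side in Python. This is a toy model and not the behaviour of any real homeserver: the messages function, the stop_when_passed flag and the integer tokens are hypothetical, and "invalid input" is assumed here to mean a to token that lies behind the from token in the direction of travel.

```python
from typing import List, Tuple


def messages(timeline: List[str], frm: int, to: int, forwards: bool,
             limit: int, stop_when_passed: bool) -> Tuple[List[str], int]:
    """Toy /messages. Token i is the gap just before timeline[i]."""
    step = 1 if forwards else -1
    pos, out = frm, []
    while len(out) < limit:
        if pos == to:
            break  # reached the `to` token exactly
        if stop_when_passed and (pos - to) * step > 0:
            break  # reading 1: we are already beyond `to`, so stop
        idx = pos if forwards else pos - 1
        if idx < 0 or idx >= len(timeline):
            break  # ran off either end of the timeline
        out.append(timeline[idx])
        pos += step
    return out, pos


timeline = ["e0", "e1", "e2", "e3", "e4"]

# Paginate forwards from token 3, but with a `to` token that lies behind us.
strict, strict_end = messages(timeline, 3, 1, True, 10, stop_when_passed=True)
lenient, lenient_end = messages(timeline, 3, 1, True, 10, stop_when_passed=False)
```

Under the first reading the call returns nothing and the end token stays where it started. Under the second reading the to token is never encountered, so the call keeps going until it hits the limit or the end of the timeline.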
However, when using filters, another issue arises: where should the tokens start and end?
As can be seen, the spec doesn't seem to specify where the end token should point to. For the circles-only filter, there's an argument to be made for putting the end batch token right after the last returned event: that way, we wouldn't skip the next square and star events in case we switch to a different filter.
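A small Python sketch shows why the placement of the end token matters. Everything in it is hypothetical (the paginate function, the two token placements it returns, and the integer tokens); it only illustrates the argument above under the assumption that tokens mark gaps between events.

```python
from typing import Callable, List, Tuple


def paginate(timeline: List[str], start: int, limit: int,
             matches: Callable[[str], bool]) -> Tuple[List[str], int, int]:
    """Toy forward pagination with a filter.

    Returns the matching events plus two candidate end tokens:
    one right after the last scanned event, and one right before
    the next matching event.
    """
    out, pos = [], start
    while pos < len(timeline) and len(out) < limit:
        if matches(timeline[pos]):
            out.append(timeline[pos])
        pos += 1
    end_tight = pos
    # Alternative: skip over everything the filter would reject anyway.
    while pos < len(timeline) and not matches(timeline[pos]):
        pos += 1
    end_greedy = pos
    return out, end_tight, end_greedy


def only_circles(event: str) -> bool:
    return event == "circle"


timeline = ["circle", "square", "circle", "square", "star", "circle"]
circles, end_tight, end_greedy = paginate(timeline, 0, 2, only_circles)

# Switch to "no filter" and continue from each candidate end token.
after_tight = timeline[end_tight:]
after_greedy = timeline[end_greedy:]
```

Continuing without a filter from the tight token still yields the square and the star, whereas continuing from the greedy token silently skips them.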
/context
If we jump to an event on the timeline, we are able to get the context of the event and see what events have been sent around the same time.
At first, the issues may seem similar to the ones presented for the /messages endpoint. However, the /context endpoint has the major disadvantage that it doesn't show the relative location of the event on the timeline.
An example
Suppose we joined a public room yesterday, then turned off our client during the night, and turned it back on today. During the night, some people sent so many events that the /sync endpoint has announced a gap to us this morning.
However, in one of the most recent events, one of the room members replies to some event in the past! Luckily, we can use /context to jump to that event - but where in the timeline is this event located? Was this event sent last night, or before we joined the room yesterday?
Since batch tokens are opaque values, we as the client cannot use them to determine where the message is located relative to the timeline that we're familiar with. Or can we?
This behaviour heavily depends on how /messages works on undefined values:
- If the endpoint stops as soon as it has passed the to token, then one can take two batch tokens (e.g. batch_token_1 and batch_token_5) and call the endpoint once in both directions (backwards and forwards). One of the two calls will return an empty list of events, which hints at the relative position of the two tokens.
- If the endpoint only stops at the to token, then the only way to determine the relative position of the event is to keep paginating /messages in either direction until you hit familiar events. (Unrelated note: this can be improved by picking a filter as specific as possible that eventually hits one of our familiar events.)
To summarize
At first, I wrote an issue for a spec clarification on this, but now it seems that it's necessary to write an MSC about it. I'd like to get feedback though, so here's an open letter to all interested people first!
The MSC would probably be a request to clarify filtering in the spec. It won't be just a clarification though, as it would mean setting so many specifics that it's likely at least one client will not have implemented them accordingly.