Merge branch '3-timeline' of github.com:noordstar/elm-matrix-sdk-beta into 3-timeline

pull/17/head
Bram 2024-02-15 11:34:25 +01:00
commit 10c7075bef
2 changed files with 208 additions and 169 deletions

docs/timeline.md

@ -0,0 +1,138 @@
# Timeline
Given the complex nature of the Timeline, its design deserves some explanation.
This document describes how the Elm SDK models the Timeline, so that other
projects may learn from it.
## API endpoint disambiguations
Generally speaking, there are a few API endpoints with similar design:
- The [`/sync` endpoint](https://spec.matrix.org/v1.9/client-server-api/#get_matrixclientv3sync),
which gets the events that the homeserver received most recently.
- The [`/messages` endpoint](https://spec.matrix.org/v1.9/client-server-api/#get_matrixclientv3roomsroomidmessages),
which returns events in their topological order.
As noted in the Matrix spec:
> Events are ordered in this API according to the arrival time of the event on
> the homeserver. This can conflict with other APIs which order events based on
> their partial ordering in the event graph. This can result in duplicate events
> being received (once per distinct API called). Clients SHOULD de-duplicate
> events based on the event ID when this happens.
For this reason, the Elm SDK maintains **two independent timelines** that are tied
together when necessary to form a coherent timeline.
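Because the same event can arrive through both endpoints, any client that
combines the two streams has to de-duplicate by event ID. As a rough
illustration (not the SDK's actual code; the `Event` record and its `eventId`
field are placeholders), such a helper could look like this:
```elm
module DeduplicateSketch exposing (Event, deduplicate)

import Set exposing (Set)


{-| A placeholder event record; the real SDK uses a richer type.
-}
type alias Event =
    { eventId : String
    , content : String
    }


{-| Keep only the first occurrence of every event ID, preserving order.
-}
deduplicate : List Event -> List Event
deduplicate events =
    let
        step : Event -> ( Set String, List Event ) -> ( Set String, List Event )
        step event ( seen, out ) =
            if Set.member event.eventId seen then
                ( seen, out )

            else
                ( Set.insert event.eventId seen, event :: out )
    in
    events
        |> List.foldl step ( Set.empty, [] )
        |> Tuple.second
        |> List.reverse
```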
## Elm design
For those unfamiliar, the Elm Architecture breaks into three parts:
- **Model** - the state of the application
- **View** - a way to turn your state into meaningful information
- **Update** - a way to update your state based on the Matrix API
Since these concepts are strictly separated, it is impossible to make an API
call while executing the **view** function; the Elm SDK must be able to
represent its state from the **Model** alone, at all times.
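To make that constraint concrete, here is a minimal Elm program in that shape.
None of these names come from the SDK; the point is only that `view` is a pure
function of the `Model`, so any Matrix data it needs must already be stored
there:
```elm
module ElmArchitectureSketch exposing (main)

import Browser
import Html exposing (Html, text)


{-| The Model holds the complete state; everything the SDK knows must live here.
-}
type alias Model =
    { timeline : List String }


{-| Update reacts to messages, for example the result of a Matrix API call.
-}
type Msg
    = GotEvents (List String)


update : Msg -> Model -> ( Model, Cmd Msg )
update msg model =
    case msg of
        GotEvents events ->
            ( { model | timeline = model.timeline ++ events }, Cmd.none )


{-| The view is a pure function of the Model: it cannot call the Matrix API,
so the Model must already contain a usable representation of the timeline.
-}
view : Model -> Html Msg
view model =
    text (String.join ", " model.timeline)


main : Program () Model Msg
main =
    Browser.element
        { init = \_ -> ( { timeline = [] }, Cmd.none )
        , update = update
        , view = view
        , subscriptions = \_ -> Sub.none
        }
```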
## Timeline
Concerning the Matrix timeline, this means creating a representation of the
timeline (**Model**), a way to display it (**View**), and a simple way to
adjust it with every incoming Matrix API result (**Update**).
First, we define what a timeline batch is.
### Timeline batch
A timeline batch is what most Matrix API endpoints return. It is a small piece
of the timeline that contains the following four pieces of information:
1. A list of events that are part of the timeline.
2. A filter for which all provided events meet the criteria.
3. An end batch token that functions as an identifier.
4. _(Optional.)_ A start token. If not provided, the batch starts at the
beginning of the timeline.
Here's an example of such a timeline batch:
```
|-->[■]->[■]->[●]->[■]->[■]->[●]-->|
|                                  |
|<----- filter: only ■ and ● ----->|
|                                  |
start:                          end:
<token_1>                  <token_2>
```
When the Matrix API later returns a batch that starts at `<token_2>`, we know
that we can connect it to the batch above and make a longer list of events!
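In Elm, such a batch can be captured by a small record. The sketch below is
illustrative rather than the SDK's exact definition (the real type lives in
`Internal.Values.Timeline` and uses internal filter and token types), but it
follows the same four pieces of information:
```elm
module BatchSketch exposing (Batch, Filter)

{-| Placeholder for the SDK's `Internal.Filter.Timeline.Filter` type, so the
sketch stands on its own.
-}
type Filter
    = TodoFilter


{-| One timeline batch. Field names are illustrative.
-}
type alias Batch =
    { events : List String -- event IDs, in the order the homeserver returned them
    , filter : Filter -- every event in `events` meets this filter
    , start : Maybe String -- start token; `Nothing` means the start of the timeline
    , end : String -- end token, used to connect this batch to other batches
    }
```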
At first, connecting batches like this seems quite simple, but a few
difficulties come up along the way.
### Challenge 1: different filters, different locations
When two timeline batches have different filters, we do not know their relative
position. For example, the following two timeline batches COULD overlap, but it
is also possible that they don't:
```
|-->[■]->[■]->[●]->[■]->[■]->[●]-->|
|                                  |
|<----- filter: only ■ and ● ----->|
|                                  |
start:                          end:
<token_1>                  <token_2>

|-->[★]->[★]->[★]->[★]-->|
|                        |
|<--- filter: only ★ --->|
|                        |
start:                end:
<token_3>        <token_4>
```
Realistically, there is currently no way of knowing without making more API
calls. However, just making more API calls isn't a solution in Elm because of
its architecture.
> **SOLUTION:** As described in the **View** function, we may assume that
overlapping timeline batches have overlapping events. If they overlap yet have
no overlapping events, then their filters must be disjoint. If the filters are
disjoint, we do not care whether they're overlapping.
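
The rule above can be phrased as a small helper: two batches are only ever
related through events they share, and sharing none means their relative
position does not matter. The sketch below is hypothetical and not part of the
SDK's API:
```elm
module OverlapSketch exposing (Relation(..), relate)

{-| How two batches relate, judged purely by shared event IDs.
-}
type Relation
    = SharedEvents (List String) -- the batches overlap at these events
    | Unrelated -- no shared events: treat the filters as effectively disjoint


{-| Decide the relation between two batches, each given as its list of event IDs.
-}
relate : List String -> List String -> Relation
relate eventsA eventsB =
    case List.filter (\eventId -> List.member eventId eventsB) eventsA of
        [] ->
            Unrelated

        shared ->
            SharedEvents shared
```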
### Challenge 2: same filters, same spot
Suppose there is a known timeline batch, and we're trying to **Update** the
timeline to represent the timeline between `<token_1>` and `<token_2>` for a
different filter:
```
|-->[■]->[■]->[●]->[■]->[■]->[●]-->|
|                                  |
|<----- filter: only ■ and ● ----->|
|                                  |
start:                          end:
<token_1>                  <token_2>
```
If we wish to know what's in there for a different filter `f`, then:
1. If `f` equals the filter from the timeline batch, we can copy the events.
2. If `f` is a subfilter of the batch filter (for example: `only ■`), then we can
copy the events from the given batch and then locally filter out the events
that do not match filter `f`.
3. If the batch filter is a subfilter of `f`, then we can use an API call
between the same batch tokens `<token_1>` and `<token_2>`. In the worst
case, we receive the exact same list of events. In another scenario, we
might discover far more events and receive some new batch token `<token_3>`
in-between `<token_1>` and `<token_2>`.
4. If neither filter is a subfilter of the other, then the two are (at least
partially) disjoint: the batches do not need to correlate, and any other
batch values can be chosen. This case analysis is sketched below.
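
The four cases amount to a small decision procedure. In the sketch below,
`Filter` is a stand-in type (a plain list of accepted event kinds) and
`isSubfilterOf` a stand-in predicate; the SDK's `Internal.Filter.Timeline`
module has its own operations for this. The sketch only shows the shape of the
**Update** logic:
```elm
module FillPlanSketch exposing (Plan(..), plan)

{-| Stand-in for the SDK's filter type: here, simply the list of event kinds a
filter lets through.
-}
type alias Filter =
    List String


{-| Hypothetical predicate: everything that `sub` lets through, `super` lets
through as well.
-}
isSubfilterOf : Filter -> Filter -> Bool
isSubfilterOf sub super =
    List.all (\kind -> List.member kind super) sub


type Plan
    = CopyEvents -- case 1: the filters are equal
    | CopyAndFilterLocally -- case 2: `f` is a subfilter of the batch filter
    | RefetchBetweenTokens -- case 3: the batch filter is a subfilter of `f`
    | NoCorrelationNeeded -- case 4: neither filter is a subfilter of the other


{-| Decide how to fill the gap between `<token_1>` and `<token_2>` for a
requested filter, given the filter of the batch we already know.
-}
plan : { batchFilter : Filter, requested : Filter } -> Plan
plan { batchFilter, requested } =
    if isSubfilterOf requested batchFilter && isSubfilterOf batchFilter requested then
        CopyEvents

    else if isSubfilterOf requested batchFilter then
        CopyAndFilterLocally

    else if isSubfilterOf batchFilter requested then
        RefetchBetweenTokens

    else
        NoCorrelationNeeded
```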


@ -2,7 +2,7 @@ module Internal.Values.Timeline exposing
( Batch, Timeline
, empty, singleton
, mostRecentEvents
, addSync, insert
, insert
, encode, decoder
)
@ -16,6 +16,29 @@ timeline is quite a complex data type, as it is constantly only partially known
by the Matrix client. This module exposes a data type that helps explore, track
and maintain this room state.
This design of the timeline uses the batches as waypoints to maintain an order.
The Matrix API often returns batches that have the following four pieces of
information:
1. A list of events.
2. A filter for which all of the events meet the criteria.
3. An end batch token.
4. _(Optional)_ A start batch token. If it is not provided, it is the start of
the timeline.
Here's an example of such a timeline batch:
|-->[■]->[■]->[●]->[■]->[■]->[●]-->|
|                                  |
|<-- filter: only ■ and ●, no ★ -->|
|                                  |
start:                          end:
<token_1>                  <token_2>
When the Matrix API later returns a batch that starts at `<token_2>`, we know
that we can connect it to the batch above and make a longer list of events!
## Batch
@ -47,8 +70,11 @@ import FastDict as Dict exposing (Dict)
import Internal.Filter.Timeline as Filter exposing (Filter)
import Internal.Tools.Hashdict as Hashdict exposing (Hashdict)
import Internal.Tools.Iddict as Iddict exposing (Iddict)
import Internal.Tools.Json as Json
import Json.Decode as D
import Json.Encode as E
import Recursion
import Recursion.Traverse
import Set exposing (Set)
@ -129,7 +155,7 @@ type Timeline
{ batches : Iddict IBatch
, events : Dict String ( IBatchPTR, List IBatchPTR )
, filledBatches : Int
, mostRecentSync : ITokenPTR
, mostRecentBatch : ITokenPTR
, tokens : Hashdict IToken
}
@ -140,22 +166,6 @@ type alias TokenValue =
String
{-| When syncing a Matrix room to its most recent state, add the most recent
batch to the front of the Timeline.
-}
addSync : Batch -> Timeline -> Timeline
addSync batch timeline =
case insertBatch batch timeline of
( Timeline tl, { start, end } ) ->
let
oldSync : ITokenPTR
oldSync =
tl.mostRecentSync
in
Timeline { tl | mostRecentSync = end }
|> connectITokenToIToken oldSync start
{-| Append a token at the end of a batch.
-}
connectIBatchToIToken : IBatchPTR -> ITokenPTR -> Timeline -> Timeline
@ -236,158 +246,11 @@ empty =
{ batches = Iddict.empty
, events = Dict.empty
, filledBatches = 0
, mostRecentSync = StartOfTimeline
, mostRecentBatch = StartOfTimeline
, tokens = Hashdict.empty .name
}
{-| Decode a Timeline from a JSON value.
-}
decoder : D.Decoder Timeline
decoder =
D.map5
(\batches events filled sync tokens ->
Timeline
{ batches = batches
, events = events
, filledBatches = filled
, mostRecentSync = sync
, tokens = tokens
}
)
(D.field "batches" <| Iddict.decoder decoderIBatch)
(D.map2 Tuple.pair
(D.field "head" decoderIBatchPTR)
(D.field "tail" <| D.list decoderIBatchPTR)
|> D.keyValuePairs
|> D.map Dict.fromList
|> D.field "events"
)
(D.succeed 0)
(D.field "mostRecentSync" decoderITokenPTR)
(D.field "tokens" <| Hashdict.decoder .name decoderIToken)
|> D.map recountFilledBatches
decoderIBatch : D.Decoder IBatch
decoderIBatch =
D.map4 IBatch
(D.field "events" <| D.list D.string)
(D.field "filter" Filter.decoder)
(D.field "start" decoderITokenPTR)
(D.field "end" decoderITokenPTR)
decoderIBatchPTR : D.Decoder IBatchPTR
decoderIBatchPTR =
D.map IBatchPTR decoderIBatchPTRValue
decoderIBatchPTRValue : D.Decoder IBatchPTRValue
decoderIBatchPTRValue =
D.int
decoderIToken : D.Decoder IToken
decoderIToken =
D.map5 IToken
(D.field "name" decoderTokenValue)
(D.field "starts" <| D.map Set.fromList <| D.list decoderIBatchPTRValue)
(D.field "ends" <| D.map Set.fromList <| D.list decoderIBatchPTRValue)
(D.field "inFrontOf" <| D.map Set.fromList <| D.list decoderITokenPTRValue)
(D.field "behind" <| D.map Set.fromList <| D.list decoderITokenPTRValue)
decoderITokenPTR : D.Decoder ITokenPTR
decoderITokenPTR =
D.oneOf
[ D.map ITokenPTR decoderITokenPTRValue
, D.null StartOfTimeline
]
decoderITokenPTRValue : D.Decoder ITokenPTRValue
decoderITokenPTRValue =
D.string
decoderTokenValue : D.Decoder TokenValue
decoderTokenValue =
D.string
{-| Encode a Timeline to a JSON value.
-}
encode : Timeline -> E.Value
encode (Timeline tl) =
E.object
[ ( "batches", Iddict.encode encodeIBatch tl.batches )
, ( "events"
, E.dict identity
(\( head, tail ) ->
E.object
[ ( "head", encodeIBatchPTR head )
, ( "tail", E.list encodeIBatchPTR tail )
]
)
(Dict.toCoreDict tl.events)
)
, ( "mostRecentSync", encodeITokenPTR tl.mostRecentSync )
, ( "tokens", Hashdict.encode encodeIToken tl.tokens )
]
encodeIBatch : IBatch -> E.Value
encodeIBatch batch =
E.object
[ ( "events", E.list E.string batch.events )
, ( "filter", Filter.encode batch.filter )
, ( "start", encodeITokenPTR batch.start )
, ( "end", encodeITokenPTR batch.end )
]
encodeIBatchPTR : IBatchPTR -> E.Value
encodeIBatchPTR (IBatchPTR value) =
encodeIBatchPTRValue value
encodeIBatchPTRValue : IBatchPTRValue -> E.Value
encodeIBatchPTRValue =
E.int
encodeIToken : IToken -> E.Value
encodeIToken itoken =
E.object
[ ( "name", encodeTokenValue itoken.name )
, ( "starts", E.set encodeIBatchPTRValue itoken.starts )
, ( "ends", E.set encodeIBatchPTRValue itoken.ends )
, ( "inFrontOf", E.set encodeITokenPTRValue itoken.inFrontOf )
, ( "behind", E.set encodeITokenPTRValue itoken.behind )
]
encodeITokenPTR : ITokenPTR -> E.Value
encodeITokenPTR token =
case token of
ITokenPTR value ->
encodeITokenPTRValue value
StartOfTimeline ->
E.null
encodeITokenPTRValue : ITokenPTRValue -> E.Value
encodeITokenPTRValue =
E.string
encodeTokenValue : TokenValue -> E.Value
encodeTokenValue =
E.string
{-| Get an IBatch from the Timeline.
-}
getIBatch : IBatchPTR -> Timeline -> Maybe IBatch
@ -516,9 +379,47 @@ invokeIToken value (Timeline tl) =
{-| Under a given filter, find the most recent events.
-}
mostRecentEvents : Filter -> Timeline -> List String
mostRecentEvents _ _ =
[]
mostRecentEvents : Filter -> Timeline -> List (List String)
mostRecentEvents filter (Timeline timeline) =
    mostRecentEventsFrom filter (Timeline timeline) timeline.mostRecentBatch


{-| Under a given filter, starting from a given ITokenPTR, find the most recent
events.
-}
mostRecentEventsFrom : Filter -> Timeline -> ITokenPTR -> List (List String)
mostRecentEventsFrom filter timeline ptr =
    Recursion.runRecursion
        (\p ->
            case getITokenFromPTR p.ptr timeline of
                -- Start of the timeline or an unknown token: nothing more to gather.
                Nothing ->
                    Recursion.base []

                Just token ->
                    -- Never visit the same token twice, so loops in the batch
                    -- graph cannot cause infinite recursion.
                    if Set.member token.name p.visited then
                        Recursion.base []

                    else
                        token.ends
                            |> Set.toList
                            |> List.filterMap (\bptrv -> getIBatch (IBatchPTR bptrv) timeline)
                            -- Keep only the batches whose filter is compatible
                            -- with the requested filter.
                            |> List.filter (\ibatch -> Filter.subsetOf ibatch.filter filter)
                            |> Recursion.Traverse.traverseList
                                (\ibatch ->
                                    -- Recurse from this batch's start token, then
                                    -- append this batch's events to every timeline
                                    -- found further back.
                                    Recursion.recurseThen
                                        { ptr = ibatch.start, visited = Set.insert token.name p.visited }
                                        (\optionalTimelines ->
                                            optionalTimelines
                                                |> List.map
                                                    (\outTimeline ->
                                                        List.append outTimeline ibatch.events
                                                    )
                                                |> Recursion.base
                                        )
                                )
                            |> Recursion.map List.concat
        )
        { ptr = ptr, visited = Set.empty }
{-| Recount the Timeline's amount of filled batches. Since the Timeline