From 6134702d25aa89a78a1028fc4e9fa67ab4b0d8b8 Mon Sep 17 00:00:00 2001 From: Bram van den Heuvel Date: Mon, 12 Feb 2024 18:54:58 +0100 Subject: [PATCH 1/3] Add Timeline documentation --- docs/timeline.md | 108 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 108 insertions(+) create mode 100644 docs/timeline.md diff --git a/docs/timeline.md b/docs/timeline.md new file mode 100644 index 0000000..8497f06 --- /dev/null +++ b/docs/timeline.md @@ -0,0 +1,108 @@ +# Timeline + +The Timeline design is complex enough that it deserves some explanation. +This document aims to describe how the Elm SDK designs the Timeline, +so that other projects may learn from it. + +## API endpoint disambiguations + +Generally speaking, there are a few API endpoints with similar design: + +- The [`/sync` endpoint](https://spec.matrix.org/v1.9/client-server-api/#get_matrixclientv3sync), +which gets the events that the homeserver received most recently. +- The [`/messages` endpoint](https://spec.matrix.org/v1.9/client-server-api/#get_matrixclientv3roomsroomidmessages), +which gets events in topological order. + +As noted in the Matrix spec: + +> Events are ordered in this API according to the arrival time of the event on +> the homeserver. This can conflict with other APIs which order events based on +> their partial ordering in the event graph. This can result in duplicate events +> being received (once per distinct API called). Clients SHOULD de-duplicate +> events based on the event ID when this happens. + +For this reason, the Elm SDK maintains **two independent timelines** that are tied +together when necessary to form a coherent timeline.
+ +## Elm design + +For those unfamiliar, the Elm Architecture breaks into three parts: + +- **Model** - the state of the application +- **View** - a way to turn your state into meaningful information +- **Update** - a way to update your state based on the Matrix API + +Since these concepts are compartmentalized, it is impossible to make an API call +while executing the **view** function; the Elm SDK must at all times find a way +to represent its state. + +## Timeline + +Concerning the Matrix timeline, the goal is to create a representation +(**Model**) of the timeline, find a way to represent (**View**) it, and find a +simple way to adjust it (**Update**) with every incoming Matrix API result. + +First, we define what a timeline batch is. + +### Timeline batch + +A timeline batch is something that most Matrix API endpoints return. It is a +little piece of the timeline and contains the following four pieces of +information: + +1. A list of events that are part of the timeline. +2. A Filter for which all provided events meet the criteria. +3. An end batch token that functions as an identifier. +4. _(Optional.)_ A start token. If not provided, it indicates the start of the + timeline. + +Here's an example of such a timeline batch: + +``` + |-->[■]->[■]->[●]->[■]->[■]->[●]-->| + | | + |<--- filter: only ■ and ● --->| + | | + start: end: + +``` + +When the Matrix API later returns a batch token that starts with ``, +we know that we can connect it to the batch above and make a longer list of +events! + +At first, this seems quite simple to connect, but there are some difficulties +that come up along the way. + +### Challenge 1: different filters, different locations + +When two timeline batches have different filters, we do not know where they +sit relative to each other.
For example, the following two timeline batches COULD +overlap, but it is also possible they don't: + +``` + |-->[■]->[■]->[●]->[■]->[■]->[●]-->| + | | + |<--- filter: only ■ and ● --->| + | | + start: end: + + + + |-->[★]->[★]->[★]->[★]-->| + | | + |<-- filter: only ★ -->| + | | + start: end: + +``` + +Realistically, there is currently no way of knowing without making more API +calls. However, just making more API calls isn't a solution in Elm because of +its architecture. + +> **SOLUTION:** As described in the **View** function, we may assume that +overlapping timeline batches have overlapping events. If they overlap yet have +no overlapping events, then their filters must be disjoint. If the filters are +disjoint, we do not care whether they're overlapping. + From 2d26e1826df7c05244defee21518a03ed97d0d06 Mon Sep 17 00:00:00 2001 From: Bram van den Heuvel Date: Tue, 13 Feb 2024 11:13:16 +0100 Subject: [PATCH 2/3] Add challenge 2 --- docs/timeline.md | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/docs/timeline.md b/docs/timeline.md index 8497f06..37fcba1 100644 --- a/docs/timeline.md +++ b/docs/timeline.md @@ -106,3 +106,33 @@ overlapping timeline batches have overlapping events. If they overlap yet have no overlapping events, then their filters must be disjoint. If the filters are disjoint, we do not care whether they're overlapping. +### Challenge 2: same filters, same spot + +Suppose there is a known timeline batch, and we're trying to **Update** the +timeline to represent the timeline between `` and `` for a +different filter: + +``` + |-->[■]->[■]->[●]->[■]->[■]->[●]-->| + | | + |<--- filter: only ■ and ● --->| + | | + start: end: + +``` + +If we wish to know what's in there for a different filter `f`, then: + +1. If `f` equals the filter from the timeline batch, we can copy the events. +2. 
If `f` is a subfilter of the batch filter (for example: `only ■`) then we can + copy the events from the given batch, and then locally filter out the events + that do not match filter `f`. +3. If the batch filter is a subfilter of `f`, then we can use an API call + between the same batch tokens `` and ``. In the worst + case, we receive the exact same list of events. In another scenario, we + might discover far more events and receive some new batch value `` + in-between `` and ``. +4. If neither filter is a subfilter of the other and the two are (at least + partially) disjoint, then they do not need to correlate and any other batch + values can be chosen. + From cf28a3f2106b87bb25a66613b8e5406d314bf87b Mon Sep 17 00:00:00 2001 From: Bram van den Heuvel Date: Thu, 15 Feb 2024 01:27:00 +0100 Subject: [PATCH 3/3] Implement `mostRecentEvents` function --- src/Internal/Values/Timeline.elm | 239 +++++++++---------------------- 1 file changed, 70 insertions(+), 169 deletions(-) diff --git a/src/Internal/Values/Timeline.elm b/src/Internal/Values/Timeline.elm index f1ee19b..018b2fe 100644 --- a/src/Internal/Values/Timeline.elm +++ b/src/Internal/Values/Timeline.elm @@ -2,7 +2,7 @@ module Internal.Values.Timeline exposing ( Batch, Timeline , empty, singleton , mostRecentEvents - , addSync, insert + , insert , encode, decoder ) @@ -16,6 +16,29 @@ timeline is quite a complex data type, as it is constantly only partially known by the Matrix client. This module exposes a data type that helps explore, track and maintain this room state. +This design of the timeline uses the batches as waypoints to maintain an order. +The Matrix API often returns batches that have the following four pieces of +information: + +1. A list of events. +2. A filter for which all of the events meet the criteria. +3. An end batch token. +4. _(Optional)_ A start batch token. If it is not provided, it is the start of + the timeline.
+ +Here's an example of such a timeline batch: + + |-->[■]->[■]->[●]->[■]->[■]->[●]-->| + | | + |<-- filter: only ■ and ●, no ★ -->| + | | + start: end: + + +When the Matrix API later returns a batch token that starts with ``, +we know that we can connect it to the batch above and make a longer list of +events! + ## Batch @@ -47,8 +70,11 @@ import FastDict as Dict exposing (Dict) import Internal.Filter.Timeline as Filter exposing (Filter) import Internal.Tools.Hashdict as Hashdict exposing (Hashdict) import Internal.Tools.Iddict as Iddict exposing (Iddict) +import Internal.Tools.Json as Json import Json.Decode as D import Json.Encode as E +import Recursion +import Recursion.Traverse import Set exposing (Set) @@ -129,7 +155,7 @@ type Timeline { batches : Iddict IBatch , events : Dict String ( IBatchPTR, List IBatchPTR ) , filledBatches : Int - , mostRecentSync : ITokenPTR + , mostRecentBatch : ITokenPTR , tokens : Hashdict IToken } @@ -140,22 +166,6 @@ type alias TokenValue = String -{-| When syncing a Matrix room to its most recent state, add the most recent -batch to the front of the Timeline. --} -addSync : Batch -> Timeline -> Timeline -addSync batch timeline = - case insertBatch batch timeline of - ( Timeline tl, { start, end } ) -> - let - oldSync : ITokenPTR - oldSync = - tl.mostRecentSync - in - Timeline { tl | mostRecentSync = end } - |> connectITokenToIToken oldSync start - - {-| Append a token at the end of a batch. -} connectIBatchToIToken : IBatchPTR -> ITokenPTR -> Timeline -> Timeline @@ -236,158 +246,11 @@ empty = { batches = Iddict.empty , events = Dict.empty , filledBatches = 0 - , mostRecentSync = StartOfTimeline + , mostRecentBatch = StartOfTimeline , tokens = Hashdict.empty .name } -{-| Decode a Timeline from a JSON value. 
--} -decoder : D.Decoder Timeline -decoder = - D.map5 - (\batches events filled sync tokens -> - Timeline - { batches = batches - , events = events - , filledBatches = filled - , mostRecentSync = sync - , tokens = tokens - } - ) - (D.field "batches" <| Iddict.decoder decoderIBatch) - (D.map2 Tuple.pair - (D.field "head" decoderIBatchPTR) - (D.field "tail" <| D.list decoderIBatchPTR) - |> D.keyValuePairs - |> D.map Dict.fromList - |> D.field "events" - ) - (D.succeed 0) - (D.field "mostRecentSync" decoderITokenPTR) - (D.field "tokens" <| Hashdict.decoder .name decoderIToken) - |> D.map recountFilledBatches - - -decoderIBatch : D.Decoder IBatch -decoderIBatch = - D.map4 IBatch - (D.field "events" <| D.list D.string) - (D.field "filter" Filter.decoder) - (D.field "start" decoderITokenPTR) - (D.field "end" decoderITokenPTR) - - -decoderIBatchPTR : D.Decoder IBatchPTR -decoderIBatchPTR = - D.map IBatchPTR decoderIBatchPTRValue - - -decoderIBatchPTRValue : D.Decoder IBatchPTRValue -decoderIBatchPTRValue = - D.int - - -decoderIToken : D.Decoder IToken -decoderIToken = - D.map5 IToken - (D.field "name" decoderTokenValue) - (D.field "starts" <| D.map Set.fromList <| D.list decoderIBatchPTRValue) - (D.field "ends" <| D.map Set.fromList <| D.list decoderIBatchPTRValue) - (D.field "inFrontOf" <| D.map Set.fromList <| D.list decoderITokenPTRValue) - (D.field "behind" <| D.map Set.fromList <| D.list decoderITokenPTRValue) - - -decoderITokenPTR : D.Decoder ITokenPTR -decoderITokenPTR = - D.oneOf - [ D.map ITokenPTR decoderITokenPTRValue - , D.null StartOfTimeline - ] - - -decoderITokenPTRValue : D.Decoder ITokenPTRValue -decoderITokenPTRValue = - D.string - - -decoderTokenValue : D.Decoder TokenValue -decoderTokenValue = - D.string - - -{-| Encode a Timeline to a JSON value. 
--} -encode : Timeline -> E.Value -encode (Timeline tl) = - E.object - [ ( "batches", Iddict.encode encodeIBatch tl.batches ) - , ( "events" - , E.dict identity - (\( head, tail ) -> - E.object - [ ( "head", encodeIBatchPTR head ) - , ( "tail", E.list encodeIBatchPTR tail ) - ] - ) - (Dict.toCoreDict tl.events) - ) - , ( "mostRecentSync", encodeITokenPTR tl.mostRecentSync ) - , ( "tokens", Hashdict.encode encodeIToken tl.tokens ) - ] - - -encodeIBatch : IBatch -> E.Value -encodeIBatch batch = - E.object - [ ( "events", E.list E.string batch.events ) - , ( "filter", Filter.encode batch.filter ) - , ( "start", encodeITokenPTR batch.start ) - , ( "end", encodeITokenPTR batch.end ) - ] - - -encodeIBatchPTR : IBatchPTR -> E.Value -encodeIBatchPTR (IBatchPTR value) = - encodeIBatchPTRValue value - - -encodeIBatchPTRValue : IBatchPTRValue -> E.Value -encodeIBatchPTRValue = - E.int - - -encodeIToken : IToken -> E.Value -encodeIToken itoken = - E.object - [ ( "name", encodeTokenValue itoken.name ) - , ( "starts", E.set encodeIBatchPTRValue itoken.starts ) - , ( "ends", E.set encodeIBatchPTRValue itoken.ends ) - , ( "inFrontOf", E.set encodeITokenPTRValue itoken.inFrontOf ) - , ( "behind", E.set encodeITokenPTRValue itoken.behind ) - ] - - -encodeITokenPTR : ITokenPTR -> E.Value -encodeITokenPTR token = - case token of - ITokenPTR value -> - encodeITokenPTRValue value - - StartOfTimeline -> - E.null - - -encodeITokenPTRValue : ITokenPTRValue -> E.Value -encodeITokenPTRValue = - E.string - - -encodeTokenValue : TokenValue -> E.Value -encodeTokenValue = - E.string - - {-| Get an IBatch from the Timeline. -} getIBatch : IBatchPTR -> Timeline -> Maybe IBatch @@ -516,9 +379,47 @@ invokeIToken value (Timeline tl) = {-| Under a given filter, find the most recent events. 
 -} -mostRecentEvents : Filter -> Timeline -> List String -mostRecentEvents _ _ = - [] +mostRecentEvents : Filter -> Timeline -> List (List String) +mostRecentEvents filter (Timeline timeline) = + mostRecentEventsFrom filter (Timeline timeline) timeline.mostRecentBatch + + +{-| Under a given filter, starting from a given ITokenPTR, find the most recent +events. +-} +mostRecentEventsFrom : Filter -> Timeline -> ITokenPTR -> List (List String) +mostRecentEventsFrom filter timeline ptr = + Recursion.runRecursion + (\p -> + case getITokenFromPTR p.ptr timeline of + Nothing -> + Recursion.base [] + + Just token -> + if Set.member token.name p.visited then + Recursion.base [] + + else + token.ends + |> Set.toList + |> List.filterMap (\bptrv -> getIBatch (IBatchPTR bptrv) timeline) + |> List.filter (\ibatch -> Filter.subsetOf ibatch.filter filter) + |> Recursion.Traverse.traverseList + (\ibatch -> + Recursion.recurseThen + { ptr = ibatch.start, visited = Set.insert token.name p.visited } + (\optionalTimelines -> + (if List.isEmpty optionalTimelines then [ [] ] else optionalTimelines) + |> List.map + (\outTimeline -> + List.append outTimeline ibatch.events + ) + |> Recursion.base + ) + ) + |> Recursion.map List.concat + ) + { ptr = ptr, visited = Set.empty } {-| Recount the Timeline's amount of filled batches. Since the Timeline