Merge branch '3-timeline' of github.com:noordstar/elm-matrix-sdk-beta into 3-timeline

pull/17/head
Bram 2024-02-15 11:34:25 +01:00
commit 10c7075bef
2 changed files with 208 additions and 169 deletions

docs/timeline.md

@ -0,0 +1,138 @@
# Timeline
Given the complex nature of the Timeline, its design deserves some explanation.
This document describes how the Elm SDK models the Timeline, so that other
projects may learn from it.
## API endpoint disambiguations
Generally speaking, there are a few API endpoints with similar design:
- The [`/sync` endpoint](https://spec.matrix.org/v1.9/client-server-api/#get_matrixclientv3sync),
which gets the events that the homeserver received most recently.
- The [`/messages` endpoint](https://spec.matrix.org/v1.9/client-server-api/#get_matrixclientv3roomsroomidmessages),
which returns events in their topological order.
As noted in the Matrix spec:
> Events are ordered in this API according to the arrival time of the event on
> the homeserver. This can conflict with other APIs which order events based on
> their partial ordering in the event graph. This can result in duplicate events
> being received (once per distinct API called). Clients SHOULD de-duplicate
> events based on the event ID when this happens.
For this reason, the Elm SDK maintains **two independent timelines** that are tied
together when necessary to form a coherent timeline.
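Because the same event can arrive through both endpoints, any client that
combines the two streams has to de-duplicate by event ID. As a rough
illustration (not the SDK's actual code; the `Event` record and its `eventId`
field are placeholders), such a helper could look like this:
```elm
module DeduplicateSketch exposing (Event, deduplicate)

import Set exposing (Set)


{-| A placeholder event record; the real SDK uses a richer type.
-}
type alias Event =
    { eventId : String
    , content : String
    }


{-| Keep only the first occurrence of every event ID, preserving order.
-}
deduplicate : List Event -> List Event
deduplicate events =
    let
        step : Event -> ( Set String, List Event ) -> ( Set String, List Event )
        step event ( seen, out ) =
            if Set.member event.eventId seen then
                ( seen, out )

            else
                ( Set.insert event.eventId seen, event :: out )
    in
    events
        |> List.foldl step ( Set.empty, [] )
        |> Tuple.second
        |> List.reverse
```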
## Elm design
For those unfamiliar, the Elm Architecture breaks into three parts:
- **Model** - the state of the application
- **View** - a way to turn your state into meaningful information
- **Update** - a way to update your state based on the Matrix API
Since these concepts are strictly separated, it is impossible to make an API
call while executing the **view** function; the Elm SDK must be able to
represent its state from the **Model** alone, at all times.
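To make that constraint concrete, here is a minimal Elm program in that shape.
None of these names come from the SDK; the point is only that `view` is a pure
function of the `Model`, so any Matrix data it needs must already be stored
there:
```elm
module ElmArchitectureSketch exposing (main)

import Browser
import Html exposing (Html, text)


{-| The Model holds the complete state; everything the SDK knows must live here.
-}
type alias Model =
    { timeline : List String }


{-| Update reacts to messages, for example the result of a Matrix API call.
-}
type Msg
    = GotEvents (List String)


update : Msg -> Model -> ( Model, Cmd Msg )
update msg model =
    case msg of
        GotEvents events ->
            ( { model | timeline = model.timeline ++ events }, Cmd.none )


{-| The view is a pure function of the Model: it cannot call the Matrix API,
so the Model must already contain a usable representation of the timeline.
-}
view : Model -> Html Msg
view model =
    text (String.join ", " model.timeline)


main : Program () Model Msg
main =
    Browser.element
        { init = \_ -> ( { timeline = [] }, Cmd.none )
        , update = update
        , view = view
        , subscriptions = \_ -> Sub.none
        }
```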
## Timeline
Concerning the Matrix timeline, this means creating a representation of the
timeline (**Model**), a way to display it (**View**), and a simple way to
adjust it with every incoming Matrix API result (**Update**).
First, we define what a timeline batch is.
### Timeline batch
A timeline batch is what most Matrix API endpoints return. It is a small piece
of the timeline that contains the following four pieces of information:
1. A list of events that are part of the timeline.
2. A filter for which all provided events meet the criteria.
3. An end batch token that functions as an identifier.
4. _(Optional.)_ A start token. If not provided, the batch starts at the
beginning of the timeline.
Here's an example of such a timeline batch:
```
|-->[■]->[■]->[●]->[■]->[■]->[●]-->|
|                                  |
|<----- filter: only ■ and ● ----->|
|                                  |
start:                          end:
<token_1>                  <token_2>
```
When the Matrix API later returns a batch that starts at `<token_2>`, we know
that we can connect it to the batch above and make a longer list of events!
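In Elm, such a batch can be captured by a small record. The sketch below is
illustrative rather than the SDK's exact definition (the real type lives in
`Internal.Values.Timeline` and uses internal filter and token types), but it
follows the same four pieces of information:
```elm
module BatchSketch exposing (Batch, Filter)

{-| Placeholder for the SDK's `Internal.Filter.Timeline.Filter` type, so the
sketch stands on its own.
-}
type Filter
    = TodoFilter


{-| One timeline batch. Field names are illustrative.
-}
type alias Batch =
    { events : List String -- event IDs, in the order the homeserver returned them
    , filter : Filter -- every event in `events` meets this filter
    , start : Maybe String -- start token; `Nothing` means the start of the timeline
    , end : String -- end token, used to connect this batch to other batches
    }
```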
At first, connecting batches like this seems quite simple, but a few
difficulties come up along the way.
### Challenge 1: different filters, different locations
When two timeline batches have different filters, we do not know their relative
position. For example, the following two timeline batches COULD overlap, but it
is also possible that they don't:
```
|-->[■]->[■]->[●]->[■]->[■]->[●]-->|
|                                  |
|<----- filter: only ■ and ● ----->|
|                                  |
start:                          end:
<token_1>                  <token_2>

|-->[★]->[★]->[★]->[★]-->|
|                        |
|<--- filter: only ★ --->|
|                        |
start:                end:
<token_3>        <token_4>
```
Realistically, there is currently no way of knowing without making more API
calls. However, just making more API calls isn't a solution in Elm because of
its architecture.
> **SOLUTION:** As described in the **View** function, we may assume that
overlapping timeline batches have overlapping events. If they overlap yet have
no overlapping events, then their filters must be disjoint. If the filters are
disjoint, we do not care whether they're overlapping.
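
The rule above can be phrased as a small helper: two batches are only ever
related through events they share, and sharing none means their relative
position does not matter. The sketch below is hypothetical and not part of the
SDK's API:
```elm
module OverlapSketch exposing (Relation(..), relate)

{-| How two batches relate, judged purely by shared event IDs.
-}
type Relation
    = SharedEvents (List String) -- the batches overlap at these events
    | Unrelated -- no shared events: treat the filters as effectively disjoint


{-| Decide the relation between two batches, each given as its list of event IDs.
-}
relate : List String -> List String -> Relation
relate eventsA eventsB =
    case List.filter (\eventId -> List.member eventId eventsB) eventsA of
        [] ->
            Unrelated

        shared ->
            SharedEvents shared
```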
### Challenge 2: same filters, same spot
Suppose there is a known timeline batch, and we're trying to **Update** the
timeline to represent the timeline between `<token_1>` and `<token_2>` for a
different filter:
```
|-->[■]->[■]->[●]->[■]->[■]->[●]-->|
|                                  |
|<----- filter: only ■ and ● ----->|
|                                  |
start:                          end:
<token_1>                  <token_2>
```
If we wish to know what's in there for a different filter `f`, then:
1. If `f` equals the filter from the timeline batch, we can copy the events.
2. If `f` is a subfilter of the batch filter (for example: `only ■`), then we can
copy the events from the given batch and then locally filter out the events
that do not match filter `f`.
3. If the batch filter is a subfilter of `f`, then we can use an API call
between the same batch tokens `<token_1>` and `<token_2>`. In the worst
case, we receive the exact same list of events. In another scenario, we
might discover far more events and receive some new batch token `<token_3>`
in-between `<token_1>` and `<token_2>`.
4. If neither filter is a subfilter of the other, then the two are (at least
partially) disjoint: the batches do not need to correlate, and any other
batch values can be chosen. This case analysis is sketched below.
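
The four cases amount to a small decision procedure. In the sketch below,
`Filter` is a stand-in type (a plain list of accepted event kinds) and
`isSubfilterOf` a stand-in predicate; the SDK's `Internal.Filter.Timeline`
module has its own operations for this. The sketch only shows the shape of the
**Update** logic:
```elm
module FillPlanSketch exposing (Plan(..), plan)

{-| Stand-in for the SDK's filter type: here, simply the list of event kinds a
filter lets through.
-}
type alias Filter =
    List String


{-| Hypothetical predicate: everything that `sub` lets through, `super` lets
through as well.
-}
isSubfilterOf : Filter -> Filter -> Bool
isSubfilterOf sub super =
    List.all (\kind -> List.member kind super) sub


type Plan
    = CopyEvents -- case 1: the filters are equal
    | CopyAndFilterLocally -- case 2: `f` is a subfilter of the batch filter
    | RefetchBetweenTokens -- case 3: the batch filter is a subfilter of `f`
    | NoCorrelationNeeded -- case 4: neither filter is a subfilter of the other


{-| Decide how to fill the gap between `<token_1>` and `<token_2>` for a
requested filter, given the filter of the batch we already know.
-}
plan : { batchFilter : Filter, requested : Filter } -> Plan
plan { batchFilter, requested } =
    if isSubfilterOf requested batchFilter && isSubfilterOf batchFilter requested then
        CopyEvents

    else if isSubfilterOf requested batchFilter then
        CopyAndFilterLocally

    else if isSubfilterOf batchFilter requested then
        RefetchBetweenTokens

    else
        NoCorrelationNeeded
```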


@ -2,7 +2,7 @@ module Internal.Values.Timeline exposing
( Batch, Timeline
, empty, singleton
, mostRecentEvents
, addSync, insert
, insert
, encode, decoder
)
@ -16,6 +16,29 @@ timeline is quite a complex data type, as it is constantly only partially known
by the Matrix client. This module exposes a data type that helps explore, track
and maintain this room state.
This design of the timeline uses the batches as waypoints to maintain an order.
The Matrix API often returns batches that have the following four pieces of
information:
1. A list of events.
2. A filter for which all of the events meet the criteria.
3. An end batch token.
4. _(Optional)_ A start batch token. If it is not provided, it is the start of
the timeline.
Here's an example of such a timeline batch:
|-->[■]->[■]->[●]->[■]->[■]->[●]-->|
|                                  |
|<-- filter: only ■ and ●, no ★ -->|
|                                  |
start:                          end:
<token_1>                  <token_2>
When the Matrix API later returns a batch that starts at `<token_2>`, we know
that we can connect it to the batch above and make a longer list of events!
## Batch
@ -47,8 +70,11 @@ import FastDict as Dict exposing (Dict)
import Internal.Filter.Timeline as Filter exposing (Filter)
import Internal.Tools.Hashdict as Hashdict exposing (Hashdict)
import Internal.Tools.Iddict as Iddict exposing (Iddict)
import Internal.Tools.Json as Json
import Json.Decode as D
import Json.Encode as E
import Recursion
import Recursion.Traverse
import Set exposing (Set)
@ -129,7 +155,7 @@ type Timeline
{ batches : Iddict IBatch
, events : Dict String ( IBatchPTR, List IBatchPTR )
, filledBatches : Int
, mostRecentSync : ITokenPTR
, mostRecentBatch : ITokenPTR
, tokens : Hashdict IToken
}
@ -140,22 +166,6 @@ type alias TokenValue =
String
{-| When syncing a Matrix room to its most recent state, add the most recent
batch to the front of the Timeline.
-}
addSync : Batch -> Timeline -> Timeline
addSync batch timeline =
case insertBatch batch timeline of
( Timeline tl, { start, end } ) ->
let
oldSync : ITokenPTR
oldSync =
tl.mostRecentSync
in
Timeline { tl | mostRecentSync = end }
|> connectITokenToIToken oldSync start
{-| Append a token at the end of a batch.
-}
connectIBatchToIToken : IBatchPTR -> ITokenPTR -> Timeline -> Timeline
@ -236,158 +246,11 @@ empty =
{ batches = Iddict.empty
, events = Dict.empty
, filledBatches = 0
, mostRecentSync = StartOfTimeline
, mostRecentBatch = StartOfTimeline
, tokens = Hashdict.empty .name
}
{-| Decode a Timeline from a JSON value.
-}
decoder : D.Decoder Timeline
decoder =
D.map5
(\batches events filled sync tokens ->
Timeline
{ batches = batches
, events = events
, filledBatches = filled
, mostRecentSync = sync
, tokens = tokens
}
)
(D.field "batches" <| Iddict.decoder decoderIBatch)
(D.map2 Tuple.pair
(D.field "head" decoderIBatchPTR)
(D.field "tail" <| D.list decoderIBatchPTR)
|> D.keyValuePairs
|> D.map Dict.fromList
|> D.field "events"
)
(D.succeed 0)
(D.field "mostRecentSync" decoderITokenPTR)
(D.field "tokens" <| Hashdict.decoder .name decoderIToken)
|> D.map recountFilledBatches
decoderIBatch : D.Decoder IBatch
decoderIBatch =
D.map4 IBatch
(D.field "events" <| D.list D.string)
(D.field "filter" Filter.decoder)
(D.field "start" decoderITokenPTR)
(D.field "end" decoderITokenPTR)
decoderIBatchPTR : D.Decoder IBatchPTR
decoderIBatchPTR =
D.map IBatchPTR decoderIBatchPTRValue
decoderIBatchPTRValue : D.Decoder IBatchPTRValue
decoderIBatchPTRValue =
D.int
decoderIToken : D.Decoder IToken
decoderIToken =
D.map5 IToken
(D.field "name" decoderTokenValue)
(D.field "starts" <| D.map Set.fromList <| D.list decoderIBatchPTRValue)
(D.field "ends" <| D.map Set.fromList <| D.list decoderIBatchPTRValue)
(D.field "inFrontOf" <| D.map Set.fromList <| D.list decoderITokenPTRValue)
(D.field "behind" <| D.map Set.fromList <| D.list decoderITokenPTRValue)
decoderITokenPTR : D.Decoder ITokenPTR
decoderITokenPTR =
D.oneOf
[ D.map ITokenPTR decoderITokenPTRValue
, D.null StartOfTimeline
]
decoderITokenPTRValue : D.Decoder ITokenPTRValue
decoderITokenPTRValue =
D.string
decoderTokenValue : D.Decoder TokenValue
decoderTokenValue =
D.string
{-| Encode a Timeline to a JSON value.
-}
encode : Timeline -> E.Value
encode (Timeline tl) =
E.object
[ ( "batches", Iddict.encode encodeIBatch tl.batches )
, ( "events"
, E.dict identity
(\( head, tail ) ->
E.object
[ ( "head", encodeIBatchPTR head )
, ( "tail", E.list encodeIBatchPTR tail )
]
)
(Dict.toCoreDict tl.events)
)
, ( "mostRecentSync", encodeITokenPTR tl.mostRecentSync )
, ( "tokens", Hashdict.encode encodeIToken tl.tokens )
]
encodeIBatch : IBatch -> E.Value
encodeIBatch batch =
E.object
[ ( "events", E.list E.string batch.events )
, ( "filter", Filter.encode batch.filter )
, ( "start", encodeITokenPTR batch.start )
, ( "end", encodeITokenPTR batch.end )
]
encodeIBatchPTR : IBatchPTR -> E.Value
encodeIBatchPTR (IBatchPTR value) =
encodeIBatchPTRValue value
encodeIBatchPTRValue : IBatchPTRValue -> E.Value
encodeIBatchPTRValue =
E.int
encodeIToken : IToken -> E.Value
encodeIToken itoken =
E.object
[ ( "name", encodeTokenValue itoken.name )
, ( "starts", E.set encodeIBatchPTRValue itoken.starts )
, ( "ends", E.set encodeIBatchPTRValue itoken.ends )
, ( "inFrontOf", E.set encodeITokenPTRValue itoken.inFrontOf )
, ( "behind", E.set encodeITokenPTRValue itoken.behind )
]
encodeITokenPTR : ITokenPTR -> E.Value
encodeITokenPTR token =
case token of
ITokenPTR value ->
encodeITokenPTRValue value
StartOfTimeline ->
E.null
encodeITokenPTRValue : ITokenPTRValue -> E.Value
encodeITokenPTRValue =
E.string
encodeTokenValue : TokenValue -> E.Value
encodeTokenValue =
E.string
{-| Get an IBatch from the Timeline.
-}
getIBatch : IBatchPTR -> Timeline -> Maybe IBatch
@ -516,9 +379,47 @@ invokeIToken value (Timeline tl) =
{-| Under a given filter, find the most recent events.
-}
mostRecentEvents : Filter -> Timeline -> List String
mostRecentEvents _ _ =
[]
mostRecentEvents : Filter -> Timeline -> List (List String)
mostRecentEvents filter (Timeline timeline) =
    mostRecentEventsFrom filter (Timeline timeline) timeline.mostRecentBatch


{-| Under a given filter, starting from a given ITokenPTR, find the most recent
events.
-}
mostRecentEventsFrom : Filter -> Timeline -> ITokenPTR -> List (List String)
mostRecentEventsFrom filter timeline ptr =
    Recursion.runRecursion
        (\p ->
            case getITokenFromPTR p.ptr timeline of
                -- Start of the timeline or an unknown token: nothing more to gather.
                Nothing ->
                    Recursion.base []

                Just token ->
                    -- Never visit the same token twice, so loops in the batch
                    -- graph cannot cause infinite recursion.
                    if Set.member token.name p.visited then
                        Recursion.base []

                    else
                        token.ends
                            |> Set.toList
                            |> List.filterMap (\bptrv -> getIBatch (IBatchPTR bptrv) timeline)
                            -- Keep only the batches whose filter is compatible
                            -- with the requested filter.
                            |> List.filter (\ibatch -> Filter.subsetOf ibatch.filter filter)
                            |> Recursion.Traverse.traverseList
                                (\ibatch ->
                                    -- Recurse from this batch's start token, then
                                    -- append this batch's events to every timeline
                                    -- found further back.
                                    Recursion.recurseThen
                                        { ptr = ibatch.start, visited = Set.insert token.name p.visited }
                                        (\optionalTimelines ->
                                            optionalTimelines
                                                |> List.map
                                                    (\outTimeline ->
                                                        List.append outTimeline ibatch.events
                                                    )
                                                |> Recursion.base
                                        )
                                )
                            |> Recursion.map List.concat
        )
        { ptr = ptr, visited = Set.empty }
{-| Recount the Timeline's amount of filled batches. Since the Timeline