Cool.
I have been spending a fair amount of time getting to know the alternatives out there for vector tiles, and I am very much looking forward to seeing what you guys come up with. I will help out in any way I can. I will share some of the conclusions I have reached in this process. My background is mostly in software engineering and less in GIS, so use this as you see fit; I most likely won't know everything on this topic, and you guys have probably already spent some time thinking about it.
The thing I hate the most about the current vector tile implementations is that they render each level of detail as a separate layer. I see the advantage of course: which features are shown at which level can be decided at tile generation time. But it also means that the client is rendering a vector layer for each level of detail.
I don't know if it will be possible, but instead of a vector layer per level of detail, I have come to the conclusion that if I were designing it, I would use the resolution at the given camera position to determine which features are returned from the server.
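To make the "resolution decides the features" idea a bit more concrete, here is a rough sketch of mapping a ground resolution to a level of detail. It assumes a Web Mercator style pyramid with 256 pixel tiles, where the resolution halves at every level; that tiling scheme and the names `level_for_resolution` and `max_level` are my own assumptions for illustration, not something from an existing spec.

```python
import math

# Assumption: Web Mercator pyramid with 256 px tiles, where level 0 covers
# the whole world at roughly 156543 metres per pixel at the equator.
LEVEL0_RESOLUTION = 156543.03392804097


def level_for_resolution(resolution: float, max_level: int = 22) -> int:
    """Return the coarsest level whose resolution is at least as fine as requested.

    `resolution` is the ground size of one screen pixel in metres at the
    camera position. `max_level` is a hypothetical cap on the pyramid depth.
    """
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    # Resolution halves per level, so the level is a base-2 logarithm.
    level = math.ceil(math.log2(LEVEL0_RESOLUTION / resolution))
    return max(0, min(max_level, level))


if __name__ == "__main__":
    print(level_for_resolution(100.0))
```

The server (or the client, when picking tiles) could then return only the features tagged for that level and coarser.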
This means that if the view starts at zoom level N, the client would have to ask for the tiles at levels of detail N, N-1, …, 0 to get all features for the current view. Tile N could of course have a byte telling how many levels of detail the client needs to traverse back to get all features, or simply a flag telling whether it has any parent features.
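A minimal sketch of that walk up the pyramid, assuming a quadtree scheme where the parent of tile (level, x, y) is (level - 1, x // 2, y // 2). The function name `tiles_to_request` and the `parent_levels` argument (standing in for the byte mentioned above) are hypothetical.

```python
from typing import List, Optional, Tuple

Tile = Tuple[int, int, int]  # (level, x, y)


def tiles_to_request(level: int, x: int, y: int,
                     parent_levels: Optional[int] = None) -> List[Tile]:
    """List the tile itself plus the ancestors needed to get all features.

    `parent_levels` plays the role of the hypothetical header byte: how many
    levels the client must traverse back. None means "all the way to level 0".
    """
    steps = level if parent_levels is None else min(parent_levels, level)
    tiles = [(level, x, y)]
    for _ in range(steps):
        # Assumption: quadtree tiling, so each parent covers a 2x2 block of children.
        level, x, y = level - 1, x // 2, y // 2
        tiles.append((level, x, y))
    return tiles


if __name__ == "__main__":
    print(tiles_to_request(3, 5, 2))
    print(tiles_to_request(3, 5, 2, parent_levels=1))
```

With the byte present, the client can stop early instead of always requesting every ancestor down to level 0.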
This means that when starting at LOD 0 and zooming in, the client already has the large covering features, and as it zooms in, the new tiles only bring features that have not already been delivered by parent tiles.
One could discuss whether the server should simplify features. It's a valid point, but when applying the above, I think the client should be able to just update the already rendered feature if a less simplified version of it is returned in a child tile.
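Roughly what I have in mind on the client side, as a sketch only. It assumes every feature carries a stable id and that a geometry delivered by a deeper level is never more simplified than one from a shallower level; `Feature` and `FeatureStore` are made-up names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]


@dataclass
class Feature:
    feature_id: str
    level: int  # level of detail of the tile that delivered this geometry
    geometry: List[Coordinate]


class FeatureStore:
    """Keeps one geometry per feature id: the most detailed one seen so far."""

    def __init__(self) -> None:
        self._features: Dict[str, Feature] = {}

    def add(self, feature: Feature) -> bool:
        """Store the feature; return True if the renderer should (re)draw it."""
        current = self._features.get(feature.feature_id)
        # Assumption: a higher level means a less simplified geometry.
        if current is not None and current.level >= feature.level:
            return False
        self._features[feature.feature_id] = feature
        return True

    def get(self, feature_id: str) -> Feature:
        return self._features[feature_id]


if __name__ == "__main__":
    store = FeatureStore()
    store.add(Feature("road-1", 0, [(0.0, 0.0), (10.0, 10.0)]))
    store.add(Feature("road-1", 2, [(0.0, 0.0), (4.0, 6.0), (10.0, 10.0)]))
    print(store.get("road-1"))
```

The renderer then only redraws when `add` returns True, so a coarse version arriving late never overwrites a detailed one.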
On tile boundaries, I would do something similar to quantized-mesh: clip on tile boundaries and let the client be responsible for combining features with the same id again. Polygons become linestrings etc., with a point on the tile boundary wherever a feature extends beyond one tile. But if we apply the technique above, there would never be any need to clip features, because they can always be returned in the parent tile whose extent contains the whole feature.
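As a sketch of how a tile generator could pick that parent tile: find the deepest tile whose extent fully contains the feature's bounding box. For simplicity this assumes coordinates normalized to the unit square and a plain quadtree; `containing_tile` is a hypothetical helper.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y) in [0, 1]
Tile = Tuple[int, int, int]               # (level, x, y)


def containing_tile(bbox: BBox, max_level: int) -> Tile:
    """Return the deepest tile (up to max_level) that contains the whole bbox."""
    min_x, min_y, max_x, max_y = bbox
    for level in range(max_level, -1, -1):
        n = 2 ** level  # number of tiles along each axis at this level
        # Tile indices of the two bbox corners, clamped to the valid range.
        x0, x1 = min(int(min_x * n), n - 1), min(int(max_x * n), n - 1)
        y0, y1 = min(int(min_y * n), n - 1), min(int(max_y * n), n - 1)
        if x0 == x1 and y0 == y1:
            return (level, x0, y0)
    # Level 0 has a single tile covering everything, so this is the fallback.
    return (0, 0, 0)


if __name__ == "__main__":
    print(containing_tile((0.1, 0.1, 0.2, 0.2), 4))
    print(containing_tile((0.4, 0.4, 0.6, 0.6), 4))
```

One thing this makes visible: a small feature that happens to straddle the centre line of the world ends up in the level 0 tile, so the generator would probably need some rule for those cases.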
Having worked with GeoJSON tiles, what seems to drive up the tile size is the amount of properties that we put into features these days. I believe that geometry and properties should be separated, so that the client can stream very lightweight tiles for drawing the geometries and then fetch properties on demand or in separate tiles, since in many cases you really don't click on or use this information. Also, with the technique above of fetching all parent tiles, no features are split over multiple tiles, so the properties can be served for the same tiles without having to decide which tile the properties go in.
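A small sketch of the on-demand part, assuming properties are published per tile under the same (level, x, y) key as the geometry. The `fetch_properties` callable stands in for whatever request the real client would make; it and `PropertyLoader` are hypothetical names.

```python
from typing import Callable, Dict, Tuple

Tile = Tuple[int, int, int]                # (level, x, y)
Properties = Dict[str, Dict[str, object]]  # feature id -> property bag


class PropertyLoader:
    """Fetches a tile's properties only when a feature in it is first queried."""

    def __init__(self, fetch_properties: Callable[[Tile], Properties]) -> None:
        # Hypothetical transport: in a real client this would be an HTTP request.
        self._fetch = fetch_properties
        self._cache: Dict[Tile, Properties] = {}

    def properties_for(self, tile: Tile, feature_id: str) -> Dict[str, object]:
        if tile not in self._cache:
            self._cache[tile] = self._fetch(tile)
        return self._cache[tile].get(feature_id, {})


if __name__ == "__main__":
    def fake_fetch(tile: Tile) -> Properties:
        print("fetching properties for", tile)
        return {"building-7": {"name": "Town hall"}}

    loader = PropertyLoader(fake_fetch)
    print(loader.properties_for((3, 5, 2), "building-7"))
    print(loader.properties_for((3, 5, 2), "building-7"))  # served from cache
```

Drawing never touches the loader, so the geometry tiles stay small and the property request only happens when the user actually picks a feature.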
I am very interested in moving vector tiles forward, and as soon as we have a spec for these tiles, I can contribute some backend work for the .NET/MSSQL community (that's our area of focus). I will keep an eye out for the issue you mentioned.