Docs – hyperverse
Sharing data across shiny modules, an update - Rtask
I thought the moment was perfect to give y’all an update on the latest approaches I’ve been using to share data across {shiny} modules, along with some thoughts and comments on the "stratégie du petit r".
But I’m trying to keep things easy to maintain. Given the current size of the codebase, adding more layers or going deeper would make the code far more complex and harder to maintain with no real benefit. So yes, some of these modules are not perfect, and they might not be doing “just one thing”.
You know what they say: “perfect is the enemy of good.”
I’ve come to terms with this idea for two reasons:
Data frames are lists, and I don’t see any good reason to forbid passing a data.frame as an argument to a function.
JavaScript is full of functions that take scalar values and a list of parameters, and it works well.
For example, making an HTTP request in JS looks like this:
fetch(
  "/api/users",
  {
    method: "GET",
    headers: {
      "Content-Type": "application/json",
      "Accept": "application/json",
      "Authorization": "Bearer YOUR_TOKEN",
    }
  }
)
Modules usually live in two scopes:
They do things within themselves
They do things that need to be passed to other modules
Doing things within themselves is pretty standard and doesn’t require a lot of thought (as long as you don’t forget the ns() 😅), but sharing things from one module to another in a reactive context can be more challenging.
PASSING REACTIVE OBJECTS
One thing I’ve learned over the years is that what works for example apps can be a nightmare in a production context. The official Shiny docs recommend the following pattern: return one or more reactive() objects that can be passed to other modules.
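To make this concrete, here’s a minimal sketch of that pattern (the module names and inputs are mine, not from the docs):

library(shiny)

mod_picker_ui <- function(id) {
  ns <- NS(id)
  selectInput(ns("var"), "Variable", names(mtcars))
}

mod_picker_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    # Return the reactive itself, not its current value
    reactive(input$var)
  })
}

mod_plot_ui <- function(id) {
  ns <- NS(id)
  plotOutput(ns("plot"))
}

mod_plot_server <- function(id, var) {
  moduleServer(id, function(input, output, session) {
    # var is a reactive passed from the outside: call it to read it
    output$plot <- renderPlot(hist(mtcars[[var()]], main = var()))
  })
}

ui <- fluidPage(mod_picker_ui("picker"), mod_plot_ui("plot"))

server <- function(input, output, session) {
  # The reactive returned by the first module is wired into the second
  selected_var <- mod_picker_server("picker")
  mod_plot_server("plot", var = selected_var)
}

shinyApp(ui, server)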
If you feel like it’s a mess and complex to reason about, that’s because it is. And we’re in a simple case where data travels at the same depth in the stack.
As a side note, I think reactive() objects are conceptually neat, but I don’t think they should be your go-to building block.
The example I have in mind is built in React, and it works just like {shiny} does (well, from a conceptual point of view): you have stateful objects, and when these objects change, they trigger another part of the app to be recomputed. In our case, whenever you interact with the first tab, the second tab (with the visualization) is updated.
To sum up, some objects are created at the top level and used to share data and trigger reactivity from one “module” to the other.
Note: my colleague Arthur pointed out that Vue.js has something similar called a store, in Pinia. I’m not exactly sure how it works, but apparently it’s more or less the same as reactiveValues. And Claude confirmed it 😄
THE “STRATÉGIE DU PETIT R”
One strategy we recommended is what we called the “stratégie du petit r” (the “small r strategy”). Looking back, I can admit that it was a poor choice of name, but you know, sh*t happens.
The principle is quite simple: instead of returning and passing reactive() objects as arguments, you create one or more reactiveValues() at an upper level, which you then pass downstream to lower-level modules. reactiveValues() behave a lot like environments, meaning that values set down the stack are available everywhere.
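In code, a minimal sketch might look like this (the module and slot names are mine):

library(shiny)

mod_set_ui <- function(id) {
  sliderInput(NS(id, "n"), "Rows", min = 1, max = nrow(mtcars), value = 10)
}

mod_set_server <- function(id, r) {
  moduleServer(id, function(input, output, session) {
    # Write into the shared reactiveValues: nothing is returned
    observeEvent(input$n, {
      r$dataset <- head(mtcars, input$n)
    })
  })
}

mod_show_ui <- function(id) {
  tableOutput(NS(id, "table"))
}

mod_show_server <- function(id, r) {
  moduleServer(id, function(input, output, session) {
    # Read from the same reactiveValues, written by another module
    output$table <- renderTable(r$dataset)
  })
}

ui <- fluidPage(mod_set_ui("set"), mod_show_ui("show"))

server <- function(input, output, session) {
  # The "petit r", created once at the top level and passed downstream
  r <- reactiveValues(dataset = NULL)
  mod_set_server("set", r)
  mod_show_server("show", r)
}

shinyApp(ui, server)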
I still think this is a valid way to share data, but only if you avoid applying it too literally and focus on how to work with it in practice.
The main criticism I’ve read about this approach is that you’ll end up with a huge r object with 300 entries in it, creating a monster that’s impossible to debug.
So yes, these monsters exist. But I don’t think the idea itself is the problem. It’s always easier to blame the tool than to acknowledge the lack of understanding behind its misuse. Or, as Beckett wrote, “Voilà l’homme tout entier, s’en prenant à sa chaussure alors que c’est son pied le coupable.” (“There’s man all over for you, blaming on his boots the faults of his feet.”)
Here are some random thoughts:
One corollary is simple: you need several reactiveValues(), operating at different scopes in your application, as in the sketch below.
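Something like this, where the module names are hypothetical:

library(shiny)

mod_upload_ui <- function(id) {
  actionButton(NS(id, "go"), "Load data")
}

mod_upload_server <- function(id, r_global, r_import) {
  moduleServer(id, function(input, output, session) {
    observeEvent(input$go, {
      r_import$raw <- mtcars
      r_global$last_action <- "upload"
    })
  })
}

mod_clean_server <- function(id, r_import) {
  moduleServer(id, function(input, output, session) {
    # This module only ever sees the import-scoped state
    observeEvent(r_import$raw, {
      r_import$cleaned <- na.omit(r_import$raw)
    })
  })
}

ui <- fluidPage(mod_upload_ui("upload"))

server <- function(input, output, session) {
  # App-wide state, available to every module
  r_global <- reactiveValues(last_action = NULL)
  # State scoped to the "import" family of modules only
  r_import <- reactiveValues(raw = NULL, cleaned = NULL)

  mod_upload_server("upload", r_global, r_import)
  mod_clean_server("clean", r_import)
}

shinyApp(ui, server)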
STORAGE USING AN R6 OBJECT
One downside I can think of when using the reactiveValues() strategy I just described is that, well, it’s reactive, meaning it can lead to uncontrolled reactivity if things aren’t scoped correctly.
One pattern I’ve used in an app is combining an R6 object, used to store and process data, with the trigger mechanism from {gargoyle}. Basically, the idea behind {gargoyle} is simple: instead of relying on the reactive graph to invalidate itself, you init flags that are triggered in the code, and when a flag is triggered, the context where the flag is watched is invalidated.
It’s a bit longer to implement, but you get better control over what is happening.
Combined with this, you can use an R6 object that is passed along the modules, and that gets transformed to store, process, and serve the data.
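Here is a compressed sketch of the combination, assuming the {gargoyle} API (init() / trigger() / watch()); the class and flag names are mine:

library(shiny)
library(R6)

# A plain R6 object: it stores and processes data, with no reactivity of its own
DataStore <- R6Class("DataStore",
  public = list(
    data = NULL,
    set = function(x) {
      self$data <- x
    }
  )
)

mod_load_ui <- function(id) {
  actionButton(NS(id, "load"), "Load data")
}

mod_load_server <- function(id, store) {
  moduleServer(id, function(input, output, session) {
    observeEvent(input$load, {
      store$set(mtcars)
      # Explicitly tell the rest of the app that the data changed
      gargoyle::trigger("data_updated")
    })
  })
}

mod_table_ui <- function(id) {
  tableOutput(NS(id, "table"))
}

mod_table_server <- function(id, store) {
  moduleServer(id, function(input, output, session) {
    output$table <- renderTable({
      # This context is invalidated only when the flag is triggered
      gargoyle::watch("data_updated")
      store$data
    })
  })
}

ui <- fluidPage(mod_load_ui("load"), mod_table_ui("table"))

server <- function(input, output, session) {
  gargoyle::init("data_updated")
  store <- DataStore$new()
  mod_load_server("load", store)
  mod_table_server("table", store)
}

shinyApp(ui, server)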
You can read more about this in “15.1.3 Building triggers and watchers” and “15.1.4 Using R6 as data storage” in Chapter 15 of the Engineering Shiny book.
SESSION$USERDATA
This one should be used with a lot of caution, but it can be very effective if you know what you’re doing (and if you don’t have too many things to share).
The session object is an environment available everywhere in your Shiny app. It represents the current interaction between each user and the R session (i.e., each user has their own). This environment has a special slot called userData that can be populated with data, and it is scoped to the session.
The way I’ve used it in the past is via wrappers, which would look like:
set_this <- function(value, session = shiny::getDefaultReactiveDomain()){
  # compute_this() stands in for whatever processing you need
  session$userData$this <- compute_this(value)
}

get_this <- function(session = shiny::getDefaultReactiveDomain()){
  session$userData$this
}
So anywhere I need it, I’ll use the wrapper function instead of session$userData$this. I would generally use it to define things at the top level that need to be accessible everywhere downstream, but I feel it might be a bit complex to manage if you need to pass data from mod_3_a to mod_3_g.
The documentation says it can be used “to store whatever session-specific data (we) want”, but my gut feeling is that it’s best not to shove too much into it. But I don’t have any rational reason for that, and I’d be happy to be proven wrong.
AN ENVIRONMENT IN THE SCOPE OF THE PACKAGE/TOP LEVEL OF THE APP
This is something a lot of R developers do: define an environment inside the package namespace so that, when the package is loaded, you can CRUD into it. For example, there are some (well, several) in {shiny}:
> shiny:::.globals
The function shinyOptions() writes to it, and getShinyOption() reads from it.
This pattern can be used as global storage, but be careful: it’s not session-scoped, so whatever is in this environment is shared across sessions.
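A minimal sketch of the pattern (the names here are illustrative, not the ones from {shiny}):

# In the package: an environment created once, at load time
.app_globals <- new.env(parent = emptyenv())

set_global <- function(name, value) {
  assign(name, value, envir = .app_globals)
}

get_global <- function(name, default = NULL) {
  if (exists(name, envir = .app_globals, inherits = FALSE)) {
    get(name, envir = .app_globals, inherits = FALSE)
  } else {
    default
  }
}

set_global("app_version", "1.2.3")
get_global("app_version")
#> [1] "1.2.3"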
AN EXTERNAL DATABASE OR STORAGE SYSTEM
Another solution is to store values in an external database, and query that DB inside modules.
If you try to implement this solution, two things to keep in mind are:
Make the data session-scoped, i.e., use session$token to identify the current session, and remove the data when the session ends.
You’ll need to handle reactivity manually, for example with {gargoyle}.
For example, with {storr}:
# Mimicking a session
session <- shiny::MockShinySession$new()
# In module 1
st <- storr::storr_rds(here::here())
st$set("dataset", mtcars, namespace = session$token)
# In module 2
st <- storr::storr_rds(here::here())
st$get("dataset", namespace = session$token)
Of course, this is a short piece of code and you’ll need more engineering, but you get the idea.
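For what it’s worth, here’s a hedged sketch of both points, reusing the {storr} object from above (the flag name is mine, and the cleanup relies on storr’s clear() method):

# In the module that writes: store the data, then signal the change
st$set("dataset", mtcars, namespace = session$token)
gargoyle::trigger("dataset_updated")

# In the module that reads: re-query whenever the flag fires
output$table <- renderTable({
  gargoyle::watch("dataset_updated")
  st$get("dataset", namespace = session$token)
})

# At the top level: drop this session's data when the session ends
session$onSessionEnded(function() {
  st$clear(namespace = session$token)
})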
VRT — GDAL Virtual Format — GDAL documentation
Azure Architecture Center
The Azure Architecture Center provides guidance for designing and building solutions on Azure by using established patterns and practices.
Geospatial Metadata Standards for Developers: The Complete Guide to Understanding, Implementing…
Everything you need to know about geospatial metadata — from why it matters to how the major standards work in practice.
The Arazzo Specification v1.0.1
The Arazzo Specification provides a mechanism that can define sequences of calls and their dependencies to be woven together and expressed in the context of delivering a particular outcome or set of outcomes when dealing with API descriptions (such as OpenAPI descriptions).
NCSS NCSP poster
AQP
TablesAndColumnsReport
Limsreport
Introduction to Vinyl Cache
PawanRamaMali/shinyoats
Foundations of System Design
Data Acquisition
Learn everything about Data Acquisition, a key knowledge category of the GISCI Geospatial Core Technical Exam. Click to start studying.
Web Mapping Service (WMS): A WMS is a standard protocol developed by the Open Geospatial Consortium (OGC) in 1999.
Web Feature Service (WFS): A WFS provides essential tools for creating interactive maps with features like search capabilities, filtering, and sorting. Unlike WMS, a WFS gives access to vector data (not raster).
GeoServices REST Specification: The GeoServices REST Specification provides an open way for web clients to communicate with GIS servers by issuing requests to the server through structured URLs. The server responds with map images, text based geographic information, or other resources that satisfy the request.
S7 & Options objects | Josiah Parry
Update on mocking for testing R packages - R-hub blog
This blog featured a post on mocking, the art of replacing a function with whatever fake we need for testing, years ago. Since then, we’ve entered a new decade, the second edition of Hadley Wickham’s and Jenny Bryan’s R packages book was published, and mocking returned to testthat, so it’s time for a new take/resources roundup!
Thanks a lot to Hannah Frick for useful feedback on this post!
Taming gnarly nested data with purrr::modify_tree – Joe Kirincic
A tutorial for using purrr::modify_tree effectively.
tipg
Simple and Fast Geospatial OGC Features and Tiles API for PostGIS.
Yes, Postgres can do session vars - but should you use them?
Animated by some comments / complaints about Postgres’ missing user variables story on a Reddit post about PostgreSQL pain points in the real world - I thought I’d elaborate a bit on sessions vars - which is indeed a little known Postgres functionality. Although this “alley” has existed for ages...
The obvious and more well-known SQL way to keep some transient state is via temp tables! They give some nice data type guarantees, performance, and editor happiness, to name a few benefits. But don’t use them for high-frequency use cases! A few temp tables per second might already be too much, and a disaster might be waiting to happen… because CREATE TEMP TABLE actually writes into the system catalogs behind the scenes, which might not be directly obvious. In cases of violent misuse (think frequent, short-lived temp tables with a lot of columns, plus an unoptimized and overloaded autovacuum together with long-running queries), this can lead to extreme catalog bloat (mostly on pg_attribute) and unnecessary IO for each session start / relcache filling / query planning. It’s also hard to recover from without some full locking, so for critical high-velocity DBs it might be a good idea to revoke temp table privileges altogether, at least for app / mortal users (not possible for superusers).
The second most obvious way to keep some DB-side session state around would probably be to use more persistent normal tables, right? Already better than temp tables, as there’s no danger of bloating the system catalog, right? NO. Pushing transient data through WAL (including replicas and backup systems) is pretty bad and pointless, and only to be recommended for tiny use cases.
In the Postgres world, exactly for these kinds of transient use cases, special UNLOGGED tables should be used! Which can relieve the IO pressure on the system / whole cluster considerably.
One just needs to account for their semi-persistent nature, and for the fact that they won’t be private anymore: that means using RLS in case of secret data, or just using some random enough keys to avoid collisions.
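From R, a hedged sketch of what that could look like with {DBI} and {RPostgres} (the connection details and table are made up):

library(DBI)

con <- dbConnect(RPostgres::Postgres(), dbname = "app")

# UNLOGGED skips the WAL, so writes are cheap and nothing is shipped to
# replicas or backups; the table is truncated after a crash, which is
# fine for transient session state
dbExecute(con, "
  CREATE UNLOGGED TABLE IF NOT EXISTS session_state (
    session_id text NOT NULL,
    key        text NOT NULL,
    value      jsonb,
    PRIMARY KEY (session_id, key)
  )
")

dbDisconnect(con)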
httpbin.org
Igloo
Discover how the sitrep() pattern simplifies R package maintenance and surfaces configuration errors in one go.
Stop playing “diagnostics ping-pong” with your users. This post explores why the _sitrep() (situation report) pattern — popularized by the usethis package — is a game-changer for R packages wrapping APIs or external software. Learn how to build structured validation functions that power both early error-handling and comprehensive system reports, featuring real-world implementation examples from the meetupr and freesurfer packages.
Geospatial API Fundamentals
Geospatial APIs enable seamless access to spatial data, powering mapping, analysis, and urban analytics with standardized operations and protocols.
Geospatial APIs are software abstraction layers that provide standardized methods to query, analyze, and visualize spatial data from diverse sources.
They support essential functionalities including 2D/3D map rendering, geocoding, coordinate transforms, and real-time sensor data integration.
Modern designs employ RESTful architectures and OGC standards to enhance interoperability, performance, and scalability across geospatial applications.
A geospatial Application Programming Interface (API) is a software abstraction layer—typically a web service or client library—that exposes standardized operations for querying, rendering, analyzing, and modeling spatial data, including vector features, raster coverages, multi-dimensional sensor observations, and geospatial attributes. Geospatial APIs are foundational for scientific computing, urban analytics, planetary research, public health surveillance, and geospatial-AI workflows, enabling programmable access to distributed spatial resources and seamless integration across data repositories, sensor infrastructures, visualization platforms, and analytic pipelines.
Adopting semantic types - Taxi
Learn how Taxi uses semantic typing to describe the meaning of data, not just its structure
Types are meant to be shared across systems, while models are system-specific. Your project structure should reflect this separation.
A well-implemented Taxi ecosystem has clear separation between shared semantics and system-specific implementations.
A mature implementation typically includes:

Shared Taxonomy
  Collection of semantic types
  Broadly shared across organization
  Version controlled and carefully governed
  Published as a reusable package

Service Implementations
  Models and service definitions using types from taxonomy
  System-specific structures
  Published to TaxiQL server (like Orbital)
  Each service depends on shared taxonomy

Data Consumers
  Import shared taxonomy only
  Don’t depend on service-specific models
  Query data using TaxiQL
  Receive data mapped to their needs

Best Practices

Type Development
  Focus on business concepts
  Keep types focused and single-purpose
  Document type meanings clearly
  Version types carefully

Model Development
  Use semantic types for fields
  Keep models service-specific
  Don’t share models between services

Service Integration
  Publish service contracts to TaxiQL server
  Use semantic types in operation signatures
  Let TaxiQL handle data mapping

Measuring Success

Your implementation is successful when:
  Services can evolve independently
  Data integration requires minimal code
  New consumers can easily discover and use data
  Changes to one service don’t cascade to others
  Semantic meaning is preserved across systems
rspatialdata
We recently implemented an open-source outdoor map service. Buzzwords: serverless, vector tiles!
Maplibre styles in React, data-dependent styling, zoom-dependent styling, dynamic layers…
Production PostGIS Vector Tiles: Caching | Crunchy Data Blog
Building maps that use dynamic tiles from the database is a lot of fun. You get the freshest data, you don't have to think about generating a static tile set, and you can do it with very minimal middleware, using pg_tileserv.
Accelerating Spatial Postgres: Varnish Cache for pg_tileserv using Kustomize | Crunchy Data Blog
Rekha offers a step-by-step guide for deploying Varnish Cache for pg_tileserv using Kustomize and the Postgres Operator for Kubernetes on OpenShift.
Documentation | Enterprise PostgreSQL Support & Kubernetes Solutions | Crunchy Data
pg_featureserv
Because there are usually many functions in a Postgres database, the service only publishes functions defined in the schemas specified in the FunctionIncludes configuration setting. By default the functions in the postgisftw schema are published.
Welcome to blockr