Altering our semver commitment so we can ship "Experimental" APIs and features

This is just a proposal for discussion; I would welcome feedback on it.

The general idea is that we could start shipping experimental features or APIs in otherwise stable releases:

  • Experimental parts of the codebase could be marked with an @experimental flag (or something similar) and/or be disabled by default, requiring developers to opt in by modifying their YML config (see the sketch after this list).
  • Our semver commitments would not apply to experimental APIs. Those APIs could be changed or removed completely in a future minor release.
  • Experimental features would be announced in a dedicated section of the changelogs. This section would also be used to notify users of experimental APIs that have changed or been removed.
  • Features/APIs would only be in the “experimental” stage for a short time. They either get removed from core or become officially supported within the time frame of one or two minor releases. (The idea here is to avoid a scenario where we end up with dozens of long-lived experimental APIs in our codebase.)
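
A minimal sketch of what the marker and opt-in could look like, assuming SilverStripe 4’s Config API; the class name, the “enabled” flag, and the @experimental tag itself are all hypothetical:

    <?php

    use SilverStripe\Core\Config\Config;

    /**
     * Hypothetical experimental feature; the class and config names are
     * illustrative only and do not exist in core.
     *
     * @experimental Not covered by semver; may change or be removed in a minor release.
     */
    class ExperimentalGridFieldSearch
    {
        /**
         * Disabled by default. A project would opt in from its YML config, e.g.:
         *
         *   ExperimentalGridFieldSearch:
         *     enabled: true
         */
        public static function isEnabled(): bool
        {
            // Returns false unless the flag has been explicitly set to true
            return (bool) Config::inst()->get(self::class, 'enabled');
        }
    }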

What’s the rationale for this?

Because modules like framework or asset are officially supported, we’re understandably wary about adding new APIs to them, since we’re on the hook for supporting those APIs long term. This limits our ability to iterate quickly on proposed changes and requires us to spend a lot of time upfront validating that a new feature/API has value before shipping it.

This slows down core development considerably.

Work on unsupported or not-yet-supported modules, by comparison, moves at a much faster pace and gives us much more leeway for mistakes and for changing our minds after the fact.

A good example of this approach is browser vendors, who ship experimental or incomplete features in otherwise stable releases. This allows real-life users to help test and refine the features.

Examples of features that could be good candidates to ship on an “experimental” basis

Further questions

  • If you’re developing Silverstripe CMS projects, would you be keen to try out experimental features? In a sandbox environment? In production?
  • As a core developer, would you be keen to get features shipped this way? Do you foresee any potential problems?
  • Does shipping “experimental” code in a stable release create potential communication problems or confusion?
  • How would we decide which features are good candidates to be shipped as experiments?
  • How would we collect results and data from experimental features?
  • How would we decide what happens with experimental code in the long run?
  • How do we mark a feature as experimental?

Here are some variations on the general idea.

Only experiment with features we are confident we want to support

Maybe we only do experiments with features we are ~95% confident we want to keep long term. This could give site owners more confidence to use the experiment, while giving us the opportunity to tweak the feature later on. This could have been useful when developing features like the “React Search bar” or the “DataObject hydration logic”, where we ended up tweaking them substantially after releasing them.

Limit the number of active experiments

Maybe we cap the number of active experiments allowed in the codebase at any given time. This would give us an incentive to make decisions about experiments and avoid having them hang around in the codebase indefinitely.

Have two classes of experiments

Similar to the “Only experiment with features we are confident we want to support” idea, we could have another class of experiments that are more edgy and that we are not confident we want to include in the codebase.

This could lead to a natural progression for our experiments, where they start off as an “edgy proof of concept”. If we decide we want to keep the experiment, it then progresses to the “confident we want to keep this” stage in the next release. Once we’ve worked out all the kinks, it then moves to full semver support.

Warn users if they use experimental code

If we’re concerned people will unwittingly run experimental code, we could throw a special exception when experimental code is run without an ALLOW_EXPERIMENTS environment variable, along the lines of the sketch below.
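
A rough sketch of that guard, assuming SilverStripe 4’s Environment API; the ExperimentGuard helper and its exception message are invented for illustration:

    <?php

    use SilverStripe\Core\Environment;

    /**
     * Hypothetical guard; nothing like this exists in core today.
     */
    class ExperimentGuard
    {
        public static function assertAllowed(string $feature): void
        {
            // Environment::getEnv() reads from the process environment or the project's .env file
            if (!Environment::getEnv('ALLOW_EXPERIMENTS')) {
                throw new \LogicException(sprintf(
                    '"%s" is experimental. Set ALLOW_EXPERIMENTS=1 in your .env file to opt in.',
                    $feature
                ));
            }
        }
    }

Experimental entry points would call ExperimentGuard::assertAllowed('some-feature') before doing any work.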

Hey Max, thanks for raising this!

We already mark some APIs as @internal, and use “experimental” as terminology in PHPDoc. @internal is actually defined in PHPDoc, which has the benefit of IDE support (e.g. strikethrough of method usage in PHPStorm). I don’t think we should use an undefined @experimental tag. For all intents and purposes, experimental APIs are “internal” until we decide that they’re not (making them part of the API surface).
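
For example, a sketch of that convention, using an invented class and method, might look like:

    <?php

    class MemberSearchService
    {
        /**
         * @internal Experimental API: may change or be removed in any minor
         *           release and is not covered by our semver commitment.
         */
        public function fuzzySearch(string $term): array
        {
            // PHPDoc-aware IDEs such as PHPStorm will flag external usages of this method
            return [];
        }
    }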

Potentially for people with less powerful IDEs, but I’m inclined to say that if you’re a developer using an API, it’s up to you to determine proper use. I’m keen to have some kind of “advisor script” which flags usage of @deprecated API in your code (through PHPStan?), and which could then also pick up @internal. You can make your own judgement as a developer about whether those warnings are relevant to your context (project code vs. community module vs. core module).

That’s really the kicker: if you encapsulate experimental APIs as new modules, you can at least track installations of that module. We have no meaningful way to do that with code usage (GitHub code searches are too inaccurate for this in most cases). I’d rather turn this problem around and see if you can get enough community interest before building an experimental API. We don’t really have good channels to ask for that kind of feedback, but we could create them, e.g. by having a more dev-centric Twitter account, a “this month in Silverstripe” newsletter, etc.

Even if we narrowed analysis to sites running on our platforms, the level of static analysis required to do this programmatically would be too great IMHO, since we can’t control type hinting in those codebases.

Another point of complexity: APIs are often required to drive value for authors rather than developers (like your “smarter search for GridField” example). In that case, I don’t think dev opt-in (or the lack of it) for experimental features is the right way to evaluate this value. Either we identify it as useful based on user research, or we don’t.

So in my mind, this is the rough decision tree:

  • Valuable user feature on core UI relies on this API? Make it stable, or don’t build it
  • Can be validated as a module without adding complexity to the API or its relation to existing core APIs? Start as a module, actively promote its use, and monitor install stats
  • High confidence, needs to be a core API? Mark as @internal with a note about the experimental nature. Actively promote it to devs, and perform qualitative research to find out uptake
  • Medium confidence, needs to be a core API? Start with an RFC, encourage validation there (or run some experimental forks)
  • Low confidence, needs to be a core API? Don’t bother.