-
Notifications
You must be signed in to change notification settings - Fork 760
v0.9 Spec Update: DateTimeInput Refinements and openUrl Function #497
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Changes: - Removed outputFormat from DateTimeInput as it was underspecified. - Added min and max properties to DateTimeInput for ISO 8601 range constraints. - Added openUrl function to the standard catalog for handling links. - Updated schema definitions to support 'void' return type for functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively updates the v0.9 specification by refining DateTimeInput, adding a new openUrl function, and updating schemas to support a void return type. The changes are clear and well-aligned with the description.
I have a couple of suggestions to improve the JSON schemas for better validation by adding format hints for date-time and URI strings.
Additionally, a more significant point of feedback concerns the FunctionCall definition in common_types.json. While adding the void return type is necessary, retaining default: "boolean" for returnType could lead to runtime errors if a developer forgets to specify returnType: "void" for a function with side-effects like openUrl. I strongly recommend removing this default to enforce explicit return type declaration, which would make the protocol more robust. Since this change would affect an unchanged line in the diff, I'm highlighting it here for your consideration.
This is a good addition, but I'm wondering if we need to also include some hints about how to use it (i.e. with buttons): I don't want the LLM to go "Oooh, links! I know how to do Markdown links, so I'll add those to all my text components!". I mean, unless we want to also allow links in Markdown. I think if we do, that will require more complex configuration of the Text widget to specify the functions that the links activate, which might add a lot of complexity that people would have to duplicate on any custom text component. |
| "type": "string", | ||
| "description": "The expected return type of the function call.", | ||
| "enum": ["string", "number", "boolean", "array", "object", "any"], | ||
| "enum": ["string", "number", "boolean", "array", "object", "any", "void"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is "any" not sufficient for this? I mean, void is more explicit, and I like that, but it isn't necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worry that not having a way to express a void return type explicitly creates confusion about expectations for LLMs and humans. E.g. imagine a catalog has a function any loadBooks() to trigger a "fire and forget" async load which doesn't actually return any data synchronously, and an "array getBooks()" function which returns books from the local database. I could imagine an agent building a layout that attempts to use loadBooks() as a parameter value to load data seeing as it may return something, while really it should use getBooks(). Having the void return type makes it super clear what the contract is.
I don't actually feel that strongly though - I'd be okay with 'any' if you feel strongly.
| "properties": { | ||
| "function": { | ||
| "type": "string", | ||
| "description": "A string specifying the local function to trigger (e.g., 'openUrl(${/url})'). This string can represent nested function calls and data model references." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to do function calls like this here, then we should probably make the FunctionCall type consistent with this. FunctionCall should just have one argument "call" that is this formatted string. This will require the same parsing and change registration as the string format function will.
Or we should make this consistent and just replace this entire definition with a FunctionCall definition, which would mean that the server action needs to have a discriminator field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One advantage of the second option is that it can be validated purely with JSON.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point. I actually kind of missed that FunctionCall was already defined, sorry.
I used it here - is this what you were imagining?
What do you think about the discriminator pattern to differentiate between functionCall (which is super clear, but adds a layer of nesting to the output) vs the "oneOf" style (more token efficient but less clear)?
| "outputFormat": { | ||
| "type": "string", | ||
| "description": "The desired format for the output string after a date or time is selected." | ||
| "min": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add regex constraints to these that force them to be data/time strings in UTC time zone:
e.g. ^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}(?:\\.\\d{1,9})?Z$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. It looks like JSON schema has something built-in for this, so I added it. It's a bit messy because these are actually DynamicStrings, but maybe it's fine.
Re markdown links: I actually think we should support them, and we sort of do implicitly already in the spec seeing as we support markdown. I don't think they need to have any relationship with the openUrl function I added here - renderers should just make markdown links work as best as they can, the same as they already do for plain non-A2UI markdown content. Re helping the LLM: Yes, that's fair! Do you think it should block this PR? I kind of want to just get the spec right here, and then we can figure out prompting strategies later. But also, this comment is also a warning sign that the spec is getting too complex. Do you think this is good complexity to add now? Or too much? Personally, I think it's a natural use of the Function concept you added and there is 1P customer demand which motivates me to do it now. |
This PR updates the A2UI v0.9 specification with the following changes:
outputFormatproperty as it was underspecified and better handled at the presentation layer. Addedminandmaxproperties to support date/time range constraints using ISO 8601 strings.openUrlfunction to the standard catalog to facilitate opening URLs (e.g., for links).common_types.jsonanda2ui_client_capabilities.jsonto includevoidas a valid function return type, supporting the newopenUrlfunction.actionconcept by moving the definition tocommon_types.json. It now supports:nameandcontext).functionproperty or string shorthand (e.g.,"action": "openUrl(${/url})").Buttoncomponent to use this new sharedActiontype.