Tech Sessions: Dispatcher Configurations in ÃÛ¶¹ÊÓÆµ Experience Manager as a Cloud Service
ÃÛ¶¹ÊÓÆµ Experience Manager (AEM) as a Cloud Service offers scalability, flexibility, and improved performance for modern digital experience platforms. At the heart of this architecture lies the AEM Dispatcher—a vital component responsible for caching, security, and request management. When properly configured, the Dispatcher accelerates content delivery, safeguards backend systems, and boosts overall site performance.
This overview highlights key Dispatcher settings, including caching strategies, access control mechanisms, and request filtering. It also outlines best practices for maintaining a secure and high-performing AEM deployment in the cloud. Whether you’re a developer, architect, or business decision-maker, a solid understanding of Dispatcher configurations is crucial to unlocking the full potential of AEM as a Cloud Service.
Hello and welcome, everyone. Happy Tuesday. Thank you for joining today’s tech session. My name is Yojuan, and just a couple of housekeeping notes before I let the presenter take it away. We encourage you to ask questions throughout the presentation in the Q&A chat, available in the corner where the presenter is speaking. Rest assured that the last five minutes or so of the presentation are dedicated to answering any additional questions. Additionally, a link to this recording will be emailed to all of you within 24 hours and will be available on the ÃÛ¶¹ÊÓÆµ Experience League website, should you wish to view it again or share it with any of your colleagues. I’ll give us a second in case there are any questions before we start. All right, there are no further questions, so I’m going to go ahead and pass it to our speaker. Thank you everyone, and enjoy.
Hi everyone. My name is Patrick. I’ll be hosting today’s tech talk on the dispatcher and how we can leverage our existing tooling to deploy its configurations inside of AEM as a Cloud Service. Let’s kick it off. A little bit about myself: I have over four years of experience supporting the Experience Cloud. Before that, I worked with various partners in the ÃÛ¶¹ÊÓÆµ technology realm on Forms, Sites, Assets, and now especially AEM as a Cloud Service. When I’m not working on AEM, I enjoy making food, listening to music, and occasionally some programming. Okay, so today’s agenda: we’re going to do a quick overview, functionality and usage, self-troubleshooting, a few live demos mixed in, and then common pitfalls and best practices.
All right, so let’s start with the basics. What is the dispatcher? The dispatcher is a caching and load-balancing tool. Before CDNs and the like, the dispatcher was the primary way to cache content and balance traffic across AEM instances. AEM is a solution that originated in the 1990s, when CDNs weren’t really a thing, so a module was built to plug into the Apache HTTP server, known as the dispatcher. We use it to this day inside of AEM as a Cloud Service. We also use this module to protect against security issues: we can use filters inside of the dispatcher to manage what can be accessed inside of AEM. So with the dispatcher we have the ability to cache, the ability to load balance, and the ability to block certain types of requests, to prevent users from getting access to sensitive information, and so on and so forth.
Now as part of today’s session, we’re going to talk about how we can build these configurations. What tooling can we use to generate them? We don’t want to have to build these manually. We’re also going to talk about how we can validate and debug these configurations, so that we can seamlessly create and deploy them. We’re going to talk about RDEs, rapid development environments, and how we can leverage those to quickly test and validate our configurations. We’ll look at the different types of pipelines: we have the web tier pipeline, which allows us to deploy our dispatcher configurations on their own. Lastly, we’re going to mention a little bit about advanced networking, which is how we allow AEM to connect to external services, either over VPN or through a dedicated egress IP, using the dispatcher, and how that can help offload some transactions from the JVM, that is, AEM itself.
The very first thing that we want to talk about is leveraging the AEM archetype. The archetype is there to help you build your project. It contains a bunch of examples and lets you generate your project for either cloud or AMS-based configurations. The slide showcases the command that we can use to generate an AEM as a Cloud Service project, which includes the default configurations for your dispatcher. It’s built with the file structure that we, ÃÛ¶¹ÊÓÆµ, recommend you use, which will allow you to build and scale your dispatcher configurations effectively. We break the configuration apart into specific files. We have the client headers files, which manage which headers are allowed to be sent from the dispatcher to AEM. We have the virtual hosts and the dispatcher farm files, which let us pick which farm we want to use. All the other configurations are likewise split into files dedicated to one concern, so that you can manage each file individually rather than having one massive file that becomes hard to maintain. Remember, we’re also using this inside of Git: all these configurations are stored in a Git repository, so if specific users own specific configurations, they each get atomic files they can manage. When the archetype build completes, you’ll see a BUILD SUCCESS message, and you can then go in and take a look at the configurations.
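The exact command from the slide isn’t captured in the recording; a representative invocation, based on the public AEM Project Archetype documentation, looks roughly like this (the archetype version and the app-specific values are placeholders you should adjust):

```shell
# Generate an AEM as a Cloud Service project; the resulting project
# includes default dispatcher configurations under dispatcher/src
mvn -B org.apache.maven.plugins:maven-archetype-plugin:3.2.1:generate \
  -D archetypeGroupId=com.adobe.aem \
  -D archetypeArtifactId=aem-project-archetype \
  -D archetypeVersion=49 \
  -D aemVersion=cloud \
  -D appTitle="My Site" \
  -D appId="mysite" \
  -D groupId="com.mysite"
```

Setting `aemVersion=cloud` is what selects the cloud-service flavor of the dispatcher configuration rather than an AMS-style one.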
Now, once we have these configurations built, how do we test them? We know that pipelines are one way: you can run a CI/CD pipeline and push content and configurations, but those do take time. If it takes you 30 to 45 minutes to do these validations, it’s an expensive operation. If you make a few changes a day and want to run some quick tests, but you have to wait 45 minutes for the result you’re looking for, you’re limited in the number of tests you can do. So the engineering team developed an SDK, similar to the AEM SDK that allows you to run AEM locally. There’s a dispatcher SDK, which comes with the AEM SDK when you download it from experience.adobe.com. There are a few prerequisites, but once you have those, you can quickly run some CLI commands to test whether your configurations are compatible and will work as expected. How do you get this? You go to experience.adobe.com, download the SDK, and when you unzip it, you’ll see that there’s a validation script, immutable-file checks, and various shell scripts that I’ll walk you through in a quick little demo on how to use as a validation tool. Once you download it, you’ll need the Docker runtime. This is because we spin up a quick Docker instance of the Apache HTTP server, throw your configurations into it, and validate them inside that runtime, giving you a quick and seamless HTTP configuration check. I’ve also put the commands we need to extract and start the SDK on the third bullet point. Once you’ve expanded the AEM SDK, you’ll see the dispatcher tools: there’s one for UNIX and one for Windows. All you need to do is expand it and set the necessary runtime permissions, as you see in the third bullet point.
Now, in the last bullet point, we run the validate command. The validate command goes through your configurations and checks them one by one to make sure they’re compatible. You’ll notice there are two commands: the validation and the immutability check. The immutability check makes sure that none of the immutable files have been changed. That’s really important, because if you try to change an immutable file, your configuration will not be deployed. And there’s a reason we have immutable files: there are certain assumptions the HTTP server and the dispatcher need to adhere to, and the way we guarantee those are met is by shipping certain files, containing some of our configurations, that cannot be changed. So the immutability check verifies that no changes have been made to files we don’t expect to change. Similarly, the validation will show us if there are any other issues. So as part of that, let’s do a quick session.
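The bullet points themselves aren’t captured in the recording; as a sketch, the extraction and validation steps on a UNIX machine look something like this (the SDK version glob and the `./dispatcher/src` path are placeholders for your own download and project layout):

```shell
# Expand the Dispatcher Tools bundled with the AEM SDK (UNIX variant)
chmod a+x aem-sdk-dispatcher-tools-*-unix.sh
./aem-sdk-dispatcher-tools-*-unix.sh

# Static validation of your dispatcher configuration source folder;
# this also flags modified immutable files
./dispatcher-sdk-*/bin/validate.sh ./dispatcher/src

# Optionally run the configuration in a local Apache/dispatcher Docker
# container, pointed at a local AEM publish instance on port 4503
./dispatcher-sdk-*/bin/docker_run.sh ./out host.docker.internal:4503 8080
```

The script names come from the Dispatcher Tools that ship inside the AEM SDK; check the README in your download, as exact names and arguments can change between SDK releases.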
Share my screen.
And so what you’ll see here is I actually have some configuration files, right? I have my SDK expanded with my files inside of there. I have my dispatcher configurations set up.
I made some modifications to the client headers and I want to make sure these actually work as expected. So what I do is go in and run the validate script, right here. I pass in the location of my dispatcher configuration files and run it. There we go; it takes a second to start. And we see there’s actually an error, and it’s even telling us exactly where the error is. Going through the error: one of the immutable files has been changed. Specifically, it’s saying it right here, where the plus sign is: someone added something to the default client headers file. The recommendation in that case is not to put the configuration there. Rather than changing the immutable file, we add a reference inside the client headers file to one of our own files. Here I’ve created one called custom_clientheaders.any, and inside it I’ll add my configuration. So I’ve removed my change from the immutable file, and I’ll rerun the check.
See right now, it’s actually catching that I’ve left the whitespace there.
And so it still failed, and it’s letting me know. That would have been tricky to catch, and a pipeline could have failed because of it. Now we know everything is successful, and that just saved me 45 minutes of waiting on the configuration and deployment process. So it’s important to use these types of tools to catch these little errors. A stray whitespace is sometimes one of the most painful things to discover as the reason something failed.
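The fix described in the demo — referencing your own file from the mutable client headers list instead of editing the immutable defaults — looks roughly like this (the custom file name is from my demo, not mandated; the default file and folder names follow the archetype layout):

```
# conf.dispatcher.d/clientheaders/clientheaders.any
$include "./default_clientheaders.any"
$include "./custom_clientheaders.any"

# conf.dispatcher.d/clientheaders/custom_clientheaders.any
# One allowed client header per line (example header, adjust to your needs)
"X-My-Custom-Header"
```

The immutable `default_clientheaders.any` stays untouched, so both the SDK’s immutability check and the pipeline remain happy.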
Now, the dispatcher SDK is a static analysis. What does that mean? It takes in the configurations and runs validations, but it doesn’t allow you to actually interact with them. So how do you quickly test something when you want more than a static analysis — when you want to interact with the configuration in an actual runtime, rather than a Docker runtime used purely for analysis? At that point, we recommend the RDE, the rapid development environment. Now, a rapid development environment is something that not everyone may have access to, and that’s okay; we’ll talk about other solutions just after this. But if you do have access to it, it’s a really nice tool that goes one step further than the SDK: it still does the static analysis, but also lets you interact with the configurations at runtime. You really only need two things at this point. You need the credit for the RDE, since the dispatcher configurations run inside the RDE. And you need aio, ÃÛ¶¹ÊÓÆµ’s CLI tool for various applications such as ÃÛ¶¹ÊÓÆµ I/O. There is an RDE plugin, there’s also a plugin for Cloud Manager, and I believe a few other plugins as well; today we’re going to focus specifically on the RDE one. Once you’ve installed aio — click the two hyperlinks I have in the slide for the instructions on getting it installed on your machine — you also need to make sure you’re running Node 20 or higher. You’ll use aio to authenticate to your IMS org, which will then let you view the RDEs set up inside your environment. Now, there are a few little things to watch for.
Once you’ve got AIO set up, and you’ve done the login, just make sure that you log into the correct org. Sometimes you may have multiple organizations. And so you just need to make sure that you select the org that matches the RDE environment. Alright, so let’s go in and let’s take a look at what that would look like.
Alright, so I’m going to close the previous tabs that I had and focus on this one here. Now, I have two vhosts set up inside of my dispatcher configuration. The first thing I want to do after building the project from the archetype is make a few changes. Here I’ve made some changes to my vhost: I’ve added an X-Vhost header, adobe_cvm, because I want to know which vhost is actually rendering content. That’s extremely important, because when we make requests to AEM, you want to know which vhost you’re using. If you have 10, 15, 20 vhosts, you don’t want to assume you’re using a specific one. Adding little hints like this allows for faster debugging. So let’s go into the RDE dispatcher folder, and I’ll build my project with mvn clean install; that’s the tooling we recommend for building the AEM archetype. Now the archetype is built and we have our code base. I’ve already authenticated against my RDE, and I’ll just run aio aem:rde:status.
That will pull up the status of the RDE and give me a bit of information on what’s currently on there.
It just wants me to log in real quick.
All right. And here we can see what’s on there. I’ve used this RDE to throw on a few quick things. I’ve deployed WKND, the demo project we use, with content and a few configurations, OSGi bundles and all that kind of stuff. And I’ve also deployed the dispatcher configuration, so I’m going to push it on again. To do that, I run the install command: aio aem:rde:install, pointing at my dispatcher configuration, which is the zip file, telling aio it’s a dispatcher configuration, and that the target is publish. So without having to run an entire pipeline, I still get the static analysis, because the static analysis is part of this deployment. But on top of that, I also get to interact with the application, which means I go one step further, and I can do all this in five minutes, not even.
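Reconstructed from the demo, the two commands look roughly like this. The artifact name follows the archetype’s naming convention and is a placeholder for your own build output; the flag and type names are from the aio RDE plugin documentation at the time of writing, so confirm them with `aio aem:rde:install --help`:

```shell
# Build the project, producing dispatcher/target/<appId>.dispatcher.cloud.zip
mvn clean install

# Push only the dispatcher artifact to the RDE's publish tier
aio aem:rde:install dispatcher/target/mysite.dispatcher.cloud.zip \
  -t dispatcher-config -s publish
```

The install command runs the same static analysis as the SDK before activating the configuration, so a syntax error fails fast here rather than 45 minutes into a pipeline.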
Right. And so that allows me to debug and validate one step further, without having to spend a lot of time figuring out what I can do or how I can debug something inside another environment. The RDE has a few advantages: I have CRXDE, and I can throw on a few OSGi configs and other things very quickly. And now, if I want to validate that the configurations are working as expected, I just go here.
I have a publish environment that I’m using right now to do some validation. If I refresh my page, I can see that my vhost is actually being used and my configuration is being applied. And I’ve done all this in about three minutes. So if you have any issues, or you’re trying to validate quickly, this is a great way to go one step beyond the static analysis of the dispatcher SDK. It allows you to make configuration changes, either in the vhost or in the actual dispatcher module, and deploy them in near real time. Now, as part of the deployment, what you see in this screenshot is, at the bottom, “syntax OK”. That tells you the static analysis passed and there are no issues that would affect runtime.
It’s really important because the last thing you want is the configurations to be thrown into the RDE and it’s not working as expected. All right, so if you see any issues, you can actually see right as part of the command. You’ll also see some recommendations.
You see here that the ignoreUrlParams setting is not enabled. We could go into the dispatcher configuration and make changes so that we do ignore some of those parameters. Why do we want that? Because it allows the dispatcher to avoid sending all those requests over to AEM: it would see these parameters, ignore them, know that the content already exists in cache, and serve it from cache. Otherwise, as soon as we add a new query string parameter, the dispatcher assumes it’s a new URL, and because of that, it goes to AEM. And every unnecessary transaction inside of AEM could be taking resources away from a transaction we actually need. You could be doing a publish action, so the publisher is receiving content from the author and processing it, while also handling requests that could have been avoided. So optimizing the dispatcher, even when it’s just a warning, is extremely important, because we don’t know what type of transactions could be happening at any point in time, and mitigating the unnecessary ones is always something we want to do.
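The warning is addressed in the farm file. ÃÛ¶¹ÊÓÆµ’s documented recommendation is to ignore all query parameters by default and deny only the ones your pages actually vary on (the `q` search parameter below is just an example of such a parameter):

```
# In the farm file, e.g. conf.dispatcher.d/available_farms/default.farm
/ignoreUrlParams {
  # Cache pages regardless of their query string by default...
  /0001 { /glob "*" /type "allow" }
  # ...except when the page genuinely varies on a parameter (example: "q")
  /0002 { /glob "q" /type "deny" }
}
```

Note the slightly counterintuitive semantics: `allow` means the parameter is ignored for caching purposes, while `deny` means a request carrying it bypasses the cache and goes to AEM.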
Okay. So if the RDE is not applicable to you, that’s fine; the next best thing is the web tier pipeline. The web tier pipeline is a utility we offer that allows you to deploy isolated code. A project generated from the archetype can contain all sorts of things: OSGi configs, components, content, Sling context-aware configurations; we all know there are a lot of moving parts inside of AEM. If we don’t want to deploy all of that and just want to deploy the dispatcher, we can leverage the web tier pipeline. It will still give us the static analysis that the dispatcher SDK gives us, and it will do the deployment to a dev instance; or, for a production deployment, to stage and then prod, because those two environments are always kept in sync. There’s always an approval button you need to press, so if you don’t want it to continue to prod, you can cancel it or do your testing in stage before that validation. This is another quick way to validate your configurations and actually exercise them in an environment, to see whether you’re getting the results you expect with real content. The advantage compared to the RDE is that, most of the time, the RDE doesn’t have as much content as a dev instance. A dev instance may have more content because you could be using content copy, taking content from prod to dev, and for that reason the dispatcher can exercise more rules: you could test rewrites and the like with more coverage. Nonetheless, it’s still fast and allows you to make those deployments quickly. And compared to aio, you can still go in and get information as well.
So once the deployment is done, you can go to the logs and download the dispatcher-tier log files. If you’re hitting issues where you’re not seeing the results you expect, you can download the dispatcher log, the httpd access log, and the httpd error log. The difference between these: the dispatcher log records information coming out of the dispatcher module. When we were looking at the demo, we saw two folders, conf.d and conf.dispatcher.d. conf.d is just the HTTP server side: that’s where you have your vhosts, your rewrite rules, and any other modules not related to the dispatcher, and httpd access and httpd error are where that information gets logged. So if you’re seeing issues with rewrites or with any module other than the dispatcher, you’d look there. The dispatcher log is where you see whether content was served from cache or actually went to the backend; and if it connected to the backend, you’ll see the connections being used, which farm the request resolved to, and all the information you need to determine what was happening while the dispatcher was making or receiving requests.
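Besides the Cloud Manager UI, the Cloud Manager plugin for aio can fetch the same logs from the command line. A sketch, with a placeholder environment ID of 12345; the service and log names here follow the Cloud Manager conventions described above, but verify the exact command shape with `aio cloudmanager --help`:

```shell
# Tail or download logs from the dispatcher tier of environment 12345
aio cloudmanager:environment:download-logs 12345 dispatcher httpd_access 1
aio cloudmanager:environment:download-logs 12345 dispatcher httpd_error 1
aio cloudmanager:environment:download-logs 12345 dispatcher dispatcher 1
```

The trailing `1` requests one day of logs; the three log names map to the access, error, and dispatcher-module logs discussed above.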
Now, the pipeline may fail as part of the deployment, which is entirely possible, because there’s a bit more testing that happens inside the web tier pipeline compared to an aio deployment, and I’ll tell you what that is. When we do a web tier deployment, some of the additional tests executed are dispatcher cache and invalidation tests. That’s a huge test: we want to make sure that invalidation runs as expected.
Because there’s nothing worse than replicating content from author to publish and not seeing it reflected, especially if you have a high-priority event or something urgent that needs to be pushed out; we want to make sure that content is effectively removed from cache and served fresh. So when we run a web tier pipeline, we have modules that connect to the dispatcher and validate that the dispatcher invalidation servlet returns the expected results and is working as expected. These types of tests are added as part of the web tier pipeline, so you may see errors there that you would never see in an aio deployment or in the static analysis. The longer deployment time is the trade-off: the reason it takes longer is that more testing is being done. Nonetheless, we can still get a lot of really good information using the SDK alone, or aio, and then finally the web tier pipeline. In the log shown here, the reason this specific pipeline failed is that, when we configured the pipeline, the path we gave it to the dispatcher code base was invalid. So when it ran a git clone of the repository and then went into the code base to run the dispatcher build, it failed because it couldn’t find the necessary files. Okay. Now, what we covered so far was how to optimize our testing and validation of dispatcher configurations using various ÃÛ¶¹ÊÓÆµ tools for quick analysis, static or not. But now I want to discuss a capability that I feel not many people are aware of: the dispatcher also allows for advanced networking. If you’re not aware of advanced networking, here’s the TL;DR.
Advanced networking is a module that exists that allows you to communicate to external services.
A lot of customers have external services that are used to retrieve live data. For example, think of a weather app; a simple scenario, but we don’t necessarily want to fetch that data through the publisher. If a request has to go from your browser to the CDN to the dispatcher into the publisher, and then route to your service, just to fetch that data, we’re doing an unnecessary transaction against the publisher. And we don’t need to: we know what we need to fetch, and we can offload that at the dispatcher and go get the necessary information there. So we can leverage Cloud Manager to create the advanced networking configuration: we create our network topology, then our advanced networking configuration. When you create your network topology, there are various kinds of network configurations: flexible port egress, dedicated egress IP, and VPN, the last being the most advanced one; we strongly recommend having your networking team help you with VPN configurations, since they require a lot of specific information. Now, to make use of this from the dispatcher, we lean on a module of the HTTP server known as mod_proxy. What we do is define locations. As you see in this code block, we define a Location, and any request coming in that matches it — in this scenario a specific URL; it’s a plain Location, not a LocationMatch, which means it has to match exactly, no regex — so every request made to api.mockbin, which is just a testing service if you ever want to validate against something, will be sent off using the proxy. And that’s what you see here with the ProxyRemoteMatch directive pointing at the proxy service.
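Pieced together from the description above, the vhost snippet would look something like this. The mockbin hostname is the demo service from the slide (written here with an `.example` domain as a placeholder); `AEM_HTTP_PROXY_HOST` and `AEM_HTTP_PROXY_PORT` are the variables injected when advanced networking is configured, as mentioned in this talk — verify the exact names against the current advanced networking documentation:

```apache
<IfModule mod_proxy.c>
  # Exact-match location: requests to this path never reach AEM;
  # the dispatcher answers them by proxying to the external service
  <Location "/api/mockbin">
    ProxyPass "https://api.mockbin.example/"
  </Location>

  # Route the outbound call through ÃÛ¶¹ÊÓÆµ's egress proxy so it leaves
  # via your dedicated egress IP or VPN
  ProxyRemoteMatch "^https://api\.mockbin\.example" \
      "http://${AEM_HTTP_PROXY_HOST}:${AEM_HTTP_PROXY_PORT}"
</IfModule>
```

Because this is an exact `Location` rather than a `LocationMatch`, only that precise path is proxied; everything else continues through the dispatcher to AEM as usual.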
And what that does is say: when we see this request, proxy it from the dispatcher to the proxy service, which we define using the AEM_HTTP_PROXY_HOST and AEM_HTTP_PROXY_PORT variables. These allow us to proxy traffic from the dispatcher through the proxy service. That means these requests never even go to AEM: they stop at the dispatcher and get forwarded off, which saves transactions against the publisher; the publisher doesn’t need to run them at all, and we can offload them completely while still leveraging our existing network topology. If you have a dedicated egress IP, then all the transactions the dispatcher sends through the proxy service will use it, and the origin, your external service, will see that dedicated egress IP, meaning you can do IP allow-listing at that point. If you have private infrastructure behind a VPN, we can leverage that as well. All of this saves transactions that the JVM, AEM, would otherwise be doing. We don’t want to use the JVM if we don’t have to; that’s the reason we have the CDN and the dispatcher, to protect the JVM as much as possible and only leverage it to perform things that none of these other tiers can do. So that right there is one of the most important ways that we can save transactions against AEM.

Okay, so let’s talk about some of the common pitfalls that I’ve seen, and some of the things we can do to prevent them. When you build your dispatcher configurations, always use relative paths. If you use absolute paths, any pipeline will fail, because an absolute path will not match what is inside the actual runtime.
If you think about it, your absolute path may have your username in the path to the file. So relative pathing is a must; just make sure to always adhere to that.
Use a unique X-Vhost header. Once you have multiple vhosts, you’re going to want to understand which vhost is responding to you. Leveraging a unique X-Vhost header is going to save a lot of time debugging these issues. The last thing you want is to have to do a production pipeline release of the dispatcher configuration because you’re having a hard time figuring out what’s really happening. Use specific server aliases, but always leave one generic one so that cache invalidation can be invoked. The recommendation is to use localhost: localhost is the dispatcher invalidation vhost in AEM as a Cloud Service, because internally everything can talk to everything else on the same network, let’s say, so all these systems can use the local loopback interface. Leveraging localhost is the default for the dispatcher invalidation vhost. If you don’t have it in there, your pipeline will fail, and it may not be apparent why. So always have one vhost, call it whatever you wish, whose aliases match localhost, because that’s the one that’s going to be used to do the dispatcher invalidation.
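Both recommendations land in the vhost file. A minimal sketch, with example names for the vhost, alias, and header value (everything here except `ServerAlias localhost` is illustrative):

```apache
# conf.d/available_vhosts/mysite.vhost (example file name)
<VirtualHost *:80>
    ServerName  mysite
    # Specific aliases for real traffic, plus localhost so the
    # pipeline's cache-invalidation checks can reach this vhost
    ServerAlias www.example.com localhost
    # Unique response header so you always know which vhost answered
    Header set X-Vhost "mysite"
    # ... rewrites, dispatcher handler, and the rest of the vhost ...
</VirtualHost>
```

With this in place, `curl -I` against any page immediately tells you which vhost served the response.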
Make sure the proper client headers are allowed. If you’re expecting a certain header to be present inside your servlet, make sure it’s allowed through to AEM. If not, it’ll be tricky to debug why your servlet isn’t behaving the way you expect, because at the dispatcher level we’re actually stripping out certain headers.
Now, lastly, the CDN is your best friend for caching, but it only knows what to cache based off the Cache-Control headers, and the best place to set your Cache-Control headers is inside the dispatcher. So if we go back and take a look at my screen here, we see that I’ve actually set a Cache-Control header; we’re managing the cache at the CDN level based on these Cache-Control headers. And if we take a look at how I set these, I’ve actually used an environment variable, CONTENT_MAX_AGE. So you can leverage environment variables inside your dispatcher configuration. What does this mean? It means that if you ever want to change your dispatcher or CDN cache lifetime, you can simply change the environment variable and not even have to run a pipeline. All transactions at the dispatcher level would then start returning the new Cache-Control value. So now you’re able to control your TTLs in production, stage, and dev without running a pipeline, simply by updating an environment variable.
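A rough sketch of what was on screen: CONTENT_MAX_AGE is my variable name, not a built-in, and the 300-second browser TTL and the `.html` match are illustrative choices:

```apache
# Drive the CDN TTL from a Cloud Manager environment variable.
# max-age is honored by the browser, s-maxage by the CDN.
<LocationMatch "\.html$">
    Header set Cache-Control "max-age=300,s-maxage=${CONTENT_MAX_AGE}"
</LocationMatch>
```

The variable itself is defined per environment in Cloud Manager (UI or CLI), which is why changing the TTL doesn’t require a code deployment.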
As for best practices, leverage only the supported modules that we’ve defined in our documentation. If you leverage or try to use any other modules or try to force install modules, the pipeline will fail. I would strongly suggest that you review the supported modules.
Leverage mod_proxy to handle any transactions that can be dealt with without talking to AEM. If you can proxy those transactions away from AEM, do so, because then your JVM can focus specifically on what it needs to do.
Keep the configurations clean and use comments so that you understand why you put something there. That way, if you ever start seeing odd behavior — an HTTP 301 or 302, or any other status codes being sent back to you — you can check your dispatcher configuration, see if you’re setting those, and maybe find a comment you left yourself that explains why it’s happening. And use the dispatcher SDK and the web tier pipeline to validate dispatcher configurations. That way you save yourself from having to debug a production system with little to no information.
And that’s it.
All right, a few questions.
Let’s see here. Okay.
Okay, yeah. So how do you configure vanity URLs in the dispatcher? If you want to configure vanity URLs in the dispatcher, there's a section inside of the dispatcher configuration. If we go into the enabled farms, I can just share my screen here, there's a comment right here: if you uncomment this, it enables the vanity URL configuration. You also need to install a small package inside of AEM, which exposes the URL that lets the dispatcher fetch the content defining the vanity URLs. Now, one thing about vanity URLs is that they don't adhere to filters. What does this mean? The way the dispatcher validates a request is: is this a vanity URL? If yes, it doesn't look at the filters and allows it through. The reason it does that is because a human being explicitly defined that URL, we know it exists, and we wouldn't want to block ourselves after we've explicitly defined it. So don't worry about filters or allowing the vanity URL through; you don't have to do that. You just define the vanity URL inside of AEM, make sure the package is installed per the documentation link, enable this configuration, and you'll be good to go.
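For reference, the farm-level block that gets uncommented looks roughly like this; the cache file path and delay are illustrative defaults from the documentation pattern:

```
/vanity_urls {
    /url "/libs/granite/dispatcher/content/vanityUrls.html"
    /file "/tmp/vanity_urls"
    /delay 300
}
```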
All right, let's see: there are three max-age values, what is the... Oh, yes, three. Okay.
Right, so good call out. There are actually two, but I put the same thing twice by accident, so ignore the duplicated max-age. There's s-maxage and there's max-age. s-maxage is the TTL at the CDN level, so you can manage the time-to-live at the CDN: the document would live at the CDN for 3600 seconds. And max-age applies at the browser level, so your browser is going to cache that document for 300 seconds. So you can define how long the document lives in each specific tier. Inside the dispatcher, we typically use stat files, which is invalidation based off publication of content: when you publish content, it gets invalidated. At the CDN level, we use s-maxage if it's defined; if it's not defined, we fall back to max-age. And the browser uses max-age. So you can see how each tier has its own setting, but the dispatcher is the one managing how long that content lives. And you can use a LocationMatch block to say a JPEG can live for this long, a PDF for this long, an HTML file for this long, and so on. Hopefully that answered your question.

Next: "We use an external CDN solution; should we still set headers on the dispatcher?" Yeah, so if you bring your own CDN, you should always define at least some caching structure, and it depends, too. Some CDNs offer a GUI where you can go in and say, if you see an HTML file, let it live in the CDN for this long, and so on. In that case you don't really need to define the headers in the dispatcher. So it depends on the CDN that you use, because some have a nice GUI that lets you set these rules interactively.
But if you don't have that and it's a simplistic CDN, then yes, you need to set these headers somewhere, and you can set them at the dispatcher level. There is another way you could set them: recently we have the config pipeline, which I didn't talk about in this session, but it allows you to set headers at the CDN level. So if you had your CDN, then our CDN, and then the dispatcher, you could choose to set the headers that go to your CDN at our CDN level. There are a lot of moving parts, but you get to pick whether you set them at our CDN level or at the dispatcher level. Nonetheless, if you don't have an interactive GUI that allows you to set them inside of your CDN, they need to be set somewhere. And if you're going to use the dispatcher to set the cache headers for our CDN, you might as well use the dispatcher to set the cache configuration for your CDN also. That way it's all in one place and you don't have configuration files scattered all over. I think it's a more elegant solution if it's all managed in one location.
Okay, yeah, that's a good question. Gwen asks: are there best-practice values for the cache control headers? Complicated question; it depends on the content. For example, JS and CSS are usually things that don't really change: you build the CSS, you build the JS, and it kind of lives forever. It's very rare that you need to replace it, unless there's a really bad bug in your JS file that you need to push out, and in AEM as a Cloud Service you actually have URL hashing, so when you do a deployment you get a new URL and the content gets invalidated anyway. So once you build your JS or CSS file, most of that stuff will live forever, and you can crank that TTL to quite a large number; you just need to render it once and leave it in the cache, to be quite honest. Even assets: an image of, say, a dog is not really going to change, so why give it a short TTL? You can leverage a longer TTL there, and the same goes for a PDF and that kind of thing. Now HTML, though, I would argue is tricky. We have authors continuously publishing content, and if your TTL is set for two days and your authors make a change today that they can't see till tomorrow, that's a bit of an unfortunate experience. So for HTML I would lower it, maybe 15 or 20 minutes, depending on the content; if it's content that you know won't change very often, you can make it a little bigger. Everything else is kind of played by ear. But CSS and JS can stay for quite some time; HTML really depends on how often your organization changes content; and assets are similar to CSS and JS: you build the asset once, and then it lives in there forever.
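A sketch of per-content-type TTLs along those lines; the values are the ballpark figures discussed above, not official recommendations:

```apache
# Illustrative per-type Cache-Control rules in a vhost file.
<LocationMatch "\.(?i:js|css)$">
    # Hashed client libraries rarely change between deployments
    Header set Cache-Control "max-age=31536000, immutable"
</LocationMatch>
<LocationMatch "\.(?i:jpe?g|png|gif|pdf)$">
    Header set Cache-Control "max-age=86400"
</LocationMatch>
<LocationMatch "\.html$">
    # Short TTL so authored changes show up quickly
    Header set Cache-Control "max-age=900"
</LocationMatch>
```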
Hopefully I gave you a bit of a runway there. Randy asks to explain more about the load balancer. Okay, well, the dispatcher itself kind of acts as a load balancer. But if we're talking more about load balancing in the sense of advanced networking, then leveraging mod_proxy is what we recommend, which is a module that Apache HTTPD exposes. If I just share my screen real quick, I can show you a few examples.
And I’ll just share my screen in two seconds. Just trying to get to the page first.
And I'll share my screen now.
And so if we're talking about load balancing as part of proxying: if you look at the default archetype, we actually have an example built into it. It's publicly available, it's not private code or anything like that, and we actually showcase how to use it. We see here that we leverage the mod_proxy module: content requested by the customer's browser goes through the CDN, and when it hits the dispatcher, the dispatcher asks, does this request look like the commerce GraphQL API? If yes, it forwards it off to commerce. AEM has a module that talks to commerce and uses it to go get products, pricing, catalogs, and all sorts of stuff, and the ÃÛ¶¹ÊÓÆµ Commerce engine uses GraphQL. So the dispatcher can see these requests come in and forward them off using mod_proxy. This acts as a load-balancing effort, because these transactions never go to AEM; everything else the dispatcher, which acts as a load balancer, cache, and so on, sends directly to AEM.
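A minimal sketch of that kind of proxy rule, modeled on the archetype example; the path and endpoint URL are placeholders for your own commerce service:

```apache
# Forward GraphQL traffic straight to the commerce backend so it never
# reaches the AEM publish JVM. Endpoint is hypothetical.
<IfModule proxy_module>
    SSLProxyEngine On
    ProxyPass        "/api/graphql" "https://commerce.example.com/graphql"
    ProxyPassReverse "/api/graphql" "https://commerce.example.com/graphql"
</IfModule>
```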
So hopefully that answered your question there.
I’m trying to take a look here.
I'm not sure about the next question, what algorithm we use in the background, but if you update your question with a bit more detail, one of the presenters can help you with that one.
Okay, so how can you configure egress IPs? The egress IP is actually configured using the Cloud Manager APIs. If you look up the Cloud Manager API spec, it's a public document; inside of your AEM as a Cloud Service program you can basically define what type of advanced networking you want, for example flexible port egress, dedicated egress IP, or VPN, and that is what defines it. You don't actually have to tell the dispatcher anything: when the request goes through the proxy, that will already be configured as part of the setup you've done in Cloud Manager. One of the presenters can share the documentation with you that discusses how to set that up so you can see how to leverage it.

Mohan asks: are there any tools to test secure caching? Well, there is something. I'm not sure if you mean that the content itself is secure on disk at the dispatcher level, or requests to AEM, because there is permission-sensitive caching, which is a capability exposed inside the dispatcher. What it allows you to do is define a set of URLs that the dispatcher has to validate before it can actually serve the content. Let me take a quick look to see if I can find that for you. Two seconds.
I’m just pulling it up for you.
And so there is an entire section of the documentation that talks about permission-sensitive caching for the dispatcher. What it does is, when a request comes into the dispatcher, it checks whether the URL matches the sensitive caching definition. We can go in and define a set of URLs: we define what is known as an auth checker inside of the dispatcher, and it says, anything that looks like these URLs here, we want to validate whether it's allowed to be served, and if not, we deny it. The way we define whether it's allowed to be served is by exposing a service, so there's a little bit of coding involved, but there is an example on the website. It shows a quick little setup, not necessarily exactly how you want to do it, but a small example of what you can do. If the servlet returns anything other than a 200, the dispatcher knows not to serve that content, because it is not meant for the viewer requesting it. We didn't discuss that today, but good question; there is a module that can help you with that.
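The documented auth_checker shape looks roughly like this; the servlet path and URL glob are examples you would replace with your own:

```
/auth_checker {
    # Servlet in AEM that returns 200 if the session may see the resource
    /url "/bin/permissioncheck"
    # Only these URLs are permission-checked before serving from cache
    /filter {
        /0000 { /glob "*" /type "deny" }
        /0001 { /glob "/content/secure/*.html" /type "allow" }
    }
    # Request headers forwarded to the check servlet
    /headers {
        /0000 { /glob "*" /type "deny" }
        /0001 { /glob "Set-Cookie:*" /type "allow" }
    }
}
```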
See here.
Next question: what happens if the publisher is down? We do have alerting on our side. If the publisher or any instance is down, that immediately pages the on-call engineering team, or the CSME, to see what's going on and do some sort of validation. We have a bunch of monitoring built into the publisher and dispatcher tiers, which continuously send pings; there's a health-check service. If you ever download the logs for the dispatcher, you'll actually see cache invalidation requests: a test cache invalidation request alongside the actual cache invalidation requests. If the test cache invalidation request, which runs, I believe, every 30 seconds, doesn't get a response back, an alert is sent to the on-call engineering team for your specific AEM as a Cloud Service instance, which then dispatches engineering to go take a look.

Oh, and on the forms question: I do have a little bit of experience with AEM Forms. The best way to determine what type of request you need to allow in the dispatcher is to open up your adaptive form, open the network tab, and do a submit. You should see, I believe, that adaptive forms do two submits.
One of them is a request to AEM to get the guide state, and the other request is to submit the guide data. Both of those requests need to be allowed, and I believe they're both POST requests. Look at the URL in the network tab and then allow a regex similar to it, so that the rule is not specific to that one form but covers forms of that nature. Then you can allow POST requests to the AEM server right from the dispatcher. This would be a filter rule: you would allow the URL with the method POST, and maybe some sort of pattern, because you don't want to hard-code it to that specific form.
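A hedged sketch of such a filter rule; the rule number and URL glob are assumptions, and the real pattern should come from what you observe in the network tab:

```
# Allow adaptive form submit POSTs (glob is illustrative)
/0150 { /type "allow" /method "POST" /url "/content/forms/af/*" }
```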
All right, we still have quite a few questions.
How can we bypass caching in the ÃÛ¶¹ÊÓÆµ Fastly CDN, since we handle caching on our own CDN? If you want to bypass caching on the ÃÛ¶¹ÊÓÆµ CDN, you would set the Cache-Control headers from the dispatcher to zero, forcing it: either don't set any, or set them to a very small value, zero or whatnot. At that point the ÃÛ¶¹ÊÓÆµ CDN would not cache anything; it would basically be a pass-through, and you would be able to serve and manage your CDN configuration at your own tier.
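As an illustration, zeroing the shared-cache TTL from the dispatcher; the exact directive values are one possible choice, not the only way to do this:

```apache
# Illustrative: s-maxage=0 tells the ÃÛ¶¹ÊÓÆµ-managed CDN not to hold the object,
# effectively making it a pass-through; your own CDN then applies its own rules.
Header set Cache-Control "s-maxage=0"
```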
"We have vanities enabled, but they're not working as intended; why might that be?" I would say at that point you may have a rewrite rule that is kicking in and causing this. One quick thing I'd recommend is to comment out a few of your rewrite rules that look like they're stripping away .html or whatnot, and then do an AIO deployment to validate whether it's working as expected.
How can we test config code for compatibility with AEM as a Cloud Service before fully migrating? Again, I would use the SDK. The SDK does static analysis against your code base, and if there are any issues, they'll be flagged. That's the perfect use case for it.

Next: handling different external domains, like CDNs.
Yeah, so you can define multiple server aliases inside of your vhost configuration file. One vhost can have a ServerAlias directive with a whitespace-separated list of aliases, and that vhost will be responsible for responding to any external origin that matches one of them. So one vhost can have multiple server aliases; it doesn't have to be one vhost, one server alias. It's just a whitespace-separated list containing all the external origins allowed to be rendered by that vhost.
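For example, a single vhost answering for several external hostnames; the hostnames here are illustrative:

```apache
<VirtualHost *:80>
    ServerName  www.example.com
    # One whitespace-separated list covers every origin this vhost serves
    ServerAlias example.com shop.example.com campaigns.example.org
    # ... rewrites, headers, filters, etc.
</VirtualHost>
```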
Jacob, if you're saying you have a lot of vanity URLs in AEM: yes, that will take some time. The vanity URLs are loaded at AEM startup, and if you have a large list, they're all stored in an array in memory and queried. That query can take time if you have a lot of vanity URLs. The recommendation at that point would be either to use the JCR resource resolver to map URLs, or to offload that to the dispatcher, because that will render a lot faster. As mentioned, vanity URLs are properties on CQ pages that need to be queried at runtime, and a query that returns a million results or more can actually cause problems. So lowering that count and offloading it to a different location, keeping the rewrites at the dispatcher level, would increase performance for you. I strongly recommend that, because pipelines have timeouts: if startup takes too long because queries like this are slow, your pipelines could start to fail, which would be really bad. So we want to offload that and move it away.
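A sketch of offloading a vanity-style mapping to the dispatcher with mod_rewrite; the paths are hypothetical:

```apache
RewriteEngine On
# Serve the friendly URL by passing it through to the real content path,
# so AEM never has to query vanity properties at request time
RewriteRule ^/summer-sale$ /content/mysite/us/en/campaigns/summer-sale.html [PT,L]
```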
Can we define a TTL for assets on the dispatcher, or is publishing or deployment the only way to flush the cache? Yeah, so the easiest way to define a TTL for assets is actually to use the CDN. The reason the CDN is the best place to set a TTL for assets is that assets are not served by AEM, and that makes a huge difference. AEM as a Cloud Service uses direct binary download, which is not a feature that exists on premise or in AMS; on premise and in AMS, the binaries are served out of the JVM. But in AEM as a Cloud Service, when you make a request to AEM, the asset is not returned out of AEM. Instead, a 302 comes out of AEM, and the dispatcher can't cache a 302. So even if you were to set those headers, the 302 gets sent to the CDN with your headers, and the CDN knows to follow the redirect. That's why when you reference an asset in AEM as a Cloud Service you don't see a 302 and then a 200; you only see a 200, because the CDN followed the redirect, downloaded the asset, and at that point you lose the headers you set at the dispatcher level. So setting them at the CDN level using the config pipeline is how you can manage how long the asset lives: you define a max-age for the asset, and your browser will adhere to that and store the asset for that amount of time.

Can we see rewrite rules in the log file? There is a rewrite logger configuration; I don't remember it off the top of my head, but you can set the rewrite log configuration so that rewrite traces are appended to the HTTPD error log file.

Next: how do you do a full dispatcher cache clear in AEM as a Cloud Service?
Well, when you run a pipeline, that will actually clear the dispatcher cache completely, because it spins up a new dispatcher, and at that point the cache is clear. But beyond that, there's not really a self-service way to do a full cache clear. The best thing would be to work with your CSME, or another ÃÛ¶¹ÊÓÆµ representative you may be assigned, to help you with these types of operations that we don't offer as self-service.
Alrighty, we are at time. Thank you, everyone, for participating and being part of the session today. If I haven't answered your question and you'd appreciate a follow-up, please feel free to submit a support ticket with your questions, and support will do their best to provide you with an answer as fast as possible. I really appreciate your attendance. With all that being said, please enjoy the rest of your day and the rest of your week. Bye bye.
Awesome. Thank you, guys. And just a quick reminder, a link to this recording will be available on ÃÛ¶¹ÊÓÆµâ€™s Experience League website, should you wish to view it again or share it with your colleagues. Thank you all again for joining us and we hope to see you all in our next session. Bye.
Key Takeaways
- Dispatcher SDK for Validation: The AEM Dispatcher SDK is a powerful tool for static analysis of configurations. It allows quick validation of configurations, checks for immutability, and identifies errors, saving significant time compared to full pipeline deployments.
- Rapid Development Environment (RDE): RDE provides an interactive runtime environment for testing and debugging configurations beyond static analysis. It enables faster validation and debugging, reducing the time required for deployment and testing.
- Advanced Networking with mod_proxy: Advanced networking configurations, such as VPN and dedicated egress IPs, can be set up using Cloud Manager. The mod_proxy module allows offloading transactions from AEM to external services, optimizing performance and reducing load on the JVM.
- Best Practices for Dispatcher Configurations: Key recommendations include using relative paths, unique x-vhost headers, proper client headers, and leveraging cache control headers to manage caching effectively. These practices help avoid pipeline failures and improve debugging efficiency.
- Web Tier Pipeline for Deployment: The web tier pipeline is a utility for deploying isolated dispatcher configurations. It includes additional tests like cache invalidation, ensuring content updates are reflected promptly and accurately in production environments.