Althttpd

URL redirection into directories unnecessarily includes the entire path [patch]

(1.2) By spindrift on 2025-10-23 17:42:48 edited from 1.1

When althttpd redirects because the requested URL names a directory but does not end in "/", the source code comments suggest that only the trailing / character is added to form the redirection target. This is in keeping with most web servers' behaviour.

Instead, althttpd provides the entire URL of the final resource (i.e. "/index.html" or an alternative may be appended to the redirection location).

When the URL is a directory and does end in a / character, no redirection is performed (correctly), and the appropriate file is seamlessly returned from within.

Proposition: While the redirection is required, it should only serve to add the / character, and not the entire new resource path. This is in keeping with the apparent intention in the source code comments, and also with behaviour of other web servers.

Example:

Suppose the file https://example.com/abc/index.html exists.

Browsing to: example.com/abc/ currently leaves the URL untouched and serves the index.html file.

Browsing to: example.com/abc (note no trailing slash) redirects to example.com/abc/index.html

This should instead redirect to example.com/abc/ and serve index.html from that directory transparently.

In response to previous suggestions, it has been mentioned that patches are thoughtfully considered, so I have attached a patch that achieves this change.

As I have previously disclosed, I am NOT a C programmer (this is the third time I have ever put a patch together or dabbled with C. Ever!) and so this is likely unsafe, insecure, ugly, and should be treated with suspicion.

I have a particular lack of knowledge regarding languages with manual memory allocation. I have never even looked at malloc or free before.

It does, however, appear to work.

Corrections and comments welcomed.

Index: althttpd.c
==================================================================
--- althttpd.c
+++ althttpd.c
@@ -3597,11 +3597,14 @@
         ** none of the relative URLs in the delivered document will be
         ** correct.  Except, we don't need to redirect if the URL
         ** consists of just a domain name and nothing else, for
         ** example "http://sqlite.org" without a trailing / does not
         ** need to redirect. */
-        Redirect(zRealScript,301,1,410); /* LOG: redirect to add trailing / */
+        char *zURLtarget = SafeMalloc(strlen(zScript) + 2);
+        snprintf(zURLtarget, strlen(zScript) + 2, "%s/", zScript);
+        Redirect(zURLtarget,301,1,410); /* LOG: redirect to add trailing / */
+        free(zURLtarget);
         return;
       }
       break;
     }
     zLine[j] = zScript[i];

If snprintf is not sufficiently portable, I did consider the following alternative. I lack the context and experience to know which of these options is likely to be preferable. I have not tested this patch version.

Index: althttpd.c
==================================================================
--- althttpd.c
+++ althttpd.c
@@ -3597,11 +3597,17 @@
         ** none of the relative URLs in the delivered document will be
         ** correct.  Except, we don't need to redirect if the URL
         ** consists of just a domain name and nothing else, for
         ** example "http://sqlite.org" without a trailing / does not
         ** need to redirect. */
-        Redirect(zRealScript,301,1,410); /* LOG: redirect to add trailing / */
+        size_t nn = strlen(zScript);
+        char *zURLtarget = SafeMalloc(nn + 2);
+        memcpy(zURLtarget, zScript, nn);
+        zURLtarget[nn] = '/';
+        zURLtarget[nn + 1] = '\0';
+        Redirect(zURLtarget,301,1,410); /* LOG: redirect to add trailing / */
+        free(zURLtarget);
         return;
       }
       break;
     }
     zLine[j] = zScript[i];
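
For readers less familiar with C strings, the byte accounting shared by both patch versions can be sketched in isolation. The helper below is hypothetical (append_slash is not an althttpd function) and uses plain malloc where the patches call SafeMalloc. It only illustrates why the buffer needs strlen + 2 bytes: one for the added '/' and one for the terminating NUL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of althttpd: return a newly allocated
** copy of zPath with a single '/' appended.  Returns NULL if the
** allocation fails.  The caller must free() the result. */
char *append_slash(const char *zPath){
  size_t n;
  char *zOut;
  assert( zPath!=0 );
  n = strlen(zPath);
  zOut = malloc(n + 2);      /* n bytes of path + '/' + terminating NUL */
  if( zOut==0 ) return 0;
  memcpy(zOut, zPath, n);    /* copy the path without its terminator */
  zOut[n] = '/';
  zOut[n+1] = '\0';
  return zOut;
}
```

Calling append_slash("/abc") yields the string "/abc/", which the caller releases with free() once it is no longer needed.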

(2) By spindrift on 2025-10-25 17:30:15 in reply to 1.2

Just for information's sake, this has now been running on my primary host for 48 hours with no apparent bugs, using the second (memcpy) version of the patch.

No crashes, no obvious memory leaks, and I'm really pleased with the advantages of having site.com/webapp style URLs redirecting to and working as site.com/webapp/ rather than site.com/webapp/index.html.

I also like that this now makes one canonical URL for both https://site.com/webapp and https://site.com/webapp/ without the first redirecting to a completely distinct URL.

Logging also working fine.

Fossil repos on the same web host are seemingly still handling their own redirection successfully and working unchanged, but I don't have deeply nested fossil servers (eg if you were running a fossil instance as a CGI from another instance that was hosted in a directory) so I can't evidence whether this sort of usage breaks.

However, it's reached the "works for me" level of use.

Patch can be used for any purpose without any conditions etc etc.

(3) By sodface on 2025-11-05 00:16:52 in reply to 2

I haven't looked at or tested your proposed patch, but in general I agree with you and would also prefer this behavior. Even though my site is lame and I rarely update it, I like the underlying mechanics of it, using index when I need CGI, index.html for static content, and index.cgi to regenerate the page after a change to my homegrown templates/scripts. I made sure to use links with a trailing slash so althttpd silently serves the right file and the user's url bar stays tidy, without the redirect and appended filename.

(4.1) By spindrift on 2025-11-05 07:26:59 edited from 4.0 in reply to 3

I'm finding it effective (and very pleased that it works, though potential improvements are likely and would be welcomed!).

For demonstration purposes see eg here:

https://ndoa.uk/countdown/

and here

https://ndoa.uk/countdown

Which both serve from the same canonical URL rather than the second redirecting to https://ndoa.uk/countdown/index.html

Edit: I'm particularly unclear about calling Redirect in that I think the 4th parameter is meant to be the source code line from which it is called. I think __LINE__ exists in C to serve this purpose, but I can't see any usages of this in the rest of the code, and the original call which I have replaced seems to use an arbitrary number for this purpose which doesn't match the source code line. So I've just left it alone, but I'm aware that I don't understand what it is meant to be set to.
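
On the __LINE__ question: __LINE__ is a standard C preprocessor macro that expands to the current source line number as an integer. The sketch below contrasts it with a hand-assigned constant such as the 410 in the original call. The names note_site, last_tag and NOTE_HERE are hypothetical and nothing here is taken from althttpd itself. Whether althttpd's fourth argument is meant as a fixed tag rather than a true line number is an assumption that this thread does not settle.

```c
#include <assert.h>

/* Hypothetical stand-in for a function that records which call site
** invoked it.  Not althttpd code. */
int last_tag = 0;

void note_site(int tag){
  assert( tag>0 );           /* tags are assumed to be positive */
  last_tag = tag;
}

/* Expands to a call carrying the line number of wherever it is written */
#define NOTE_HERE() note_site(__LINE__)
```

A hand-assigned constant such as note_site(410) stays the same when lines are added above the call, whereas the value passed by NOTE_HERE() changes with every such edit.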

As mentioned previously, reviewing the althttpd.c source code represents the entirety of my C education...

(5) By sodface on 2025-11-05 15:10:38 in reply to 4.1

This one-line addition is tested (but not extensively) and working for me. Can you test it?

--- althttpd.c.orig
+++ althttpd.c
@@ -3599,6 +3599,7 @@
         ** consists of just a domain name and nothing else, for
         ** example "http://sqlite.org" without a trailing / does not
         ** need to redirect. */
+        zRealScript[i+1] = 0;
         Redirect(zRealScript,301,1,410); /* LOG: redirect to add trailing / */
         return;
       }

(6) By spindrift on 2025-11-05 17:45:53 in reply to 5

Ah. That's very succinct. I didn't know if it was safe to mutate zRealScript.

Just to check my understanding, you are null terminating the string a bit early?

I'll give it a whirl when I have an opportunity.

(7) By sodface on 2025-11-05 20:58:08 in reply to 6

Just to check my understanding, you are null terminating the string a bit early?

Exactly. By the time we hit this section of code (as you know) we've already decided a redirect is needed and appended the slash and filename to zRealScript. i+1 seems to always be the position of the first character of the appended filename, so we replace it with a null byte, effectively chopping off the filename but keeping the slash. As far as I know this is perfectly safe to do.
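
The truncation trick can be shown on its own with a small sketch. The helper name truncate_after and the index value in the usage note are made up for illustration. In althttpd the index comes from the surrounding loop, which is not reproduced here. The string must live in writable memory (an array or allocated buffer, not a string literal).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of althttpd: given a writable string
** and the index of a '/' inside it, end the string immediately after
** that slash by overwriting the next byte with NUL. */
void truncate_after(char *zPath, size_t iSlash){
  assert( zPath!=0 );
  assert( iSlash<strlen(zPath) );   /* index must be inside the string */
  assert( zPath[iSlash]=='/' );     /* and must point at a slash */
  zPath[iSlash+1] = '\0';
}
```

With a buffer holding "/abc/index.html" and iSlash set to 4 (the position of the second slash), the buffer afterwards reads "/abc/". No memory is allocated or freed. The bytes after the new terminator are still in the buffer but are ignored by string functions.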

(8) By spindrift on 2025-11-05 22:08:22 in reply to 7

Given some of the recent criticism on the SQLite forum about text functions truncating strings "early" at the first zero byte, I find this elegant solution a rather delicious irony 😆

(10) By spindrift on 2026-03-08 09:09:58 in reply to 5

Incidentally, I've now been running this (with your patch) for 4 months and no issues.

Working fine for me, including embedded fossil instances served via CGI.

(11) By Stephan Beal (stephan) on 2026-03-08 09:54:29 in reply to 10

Working fine for me, including embedded fossil instances served via CGI.

Just to clarify: the one-line patch is working for you, or the bigger patch with that one line applied?

(12) By spindrift on 2026-03-08 10:07:00 in reply to 11

Good question Stephan - apologies for not being explicit.

All three of the patches in this thread seem to work without issue.

However, I have been running with Sodface's concise insertion of a zero byte into the appropriate location of the path string.

I consider it quite cheeky, and mainly just wanted him to know that I hadn't run into anything untoward.

All of the patches have been run separately for several weeks each though, and without apparent problems.

(13.1) By sodface on 2026-03-08 13:24:31 edited from 13.0 in reply to 12

mainly just wanted him to know that I hadn't run into anything untoward.

Thanks for the feedback. Not to be pushy with our kind hosts, but it would be nice to have this merged, not only because it cleans up the URL bar on redirects but because I think it's more correct behavior based on the mod_dir doc I quoted in my other comment. I haven't done any searching to find a good example link served by Apache to corroborate that and I'm also not sure how other web servers handle it, so I haven't built much of a case. Leaving now, but I'll see if I can find some good examples later today.

(14) By sodface on 2026-03-09 01:18:19 in reply to 13.1

I guess Apache's documentation section works as an example:

Browsing to:
https://httpd.apache.org/docs/current

Redirects to:
https://httpd.apache.org/docs/current/

But this also works, indicating that the index.html filename could have been sent in the above redirect but it wasn't, only the trailing slash was:
https://httpd.apache.org/docs/current/index.html

(15) By Stephan Beal (stephan) on 2026-03-09 08:19:06 in reply to 13.1

Thanks for the feedback and not to be pushy with our kind hosts, but it would be nice to have this merged not only because ...

That's now checked in to the /timeline?c=redirect-truncate branch and is running on my web server (so that makes three(?) of us). i'll prod drh again in a week or two about merging it into trunk (please don't let me forget!).

(16) By drh on 2026-03-09 11:07:42 in reply to 15

Is it incorrect in some way to send the full URL? What does it matter that the full redirect URL is being sent?

(17) By sodface on 2026-03-09 16:27:53 in reply to 16

I think this requested change is mostly cosmetic. Subjectively, seeing just the trailing slash in the URL is cleaner than, e.g., /index.html.

I haven't fully researched, but I don't think the HTTP spec defines what to do when a directory is requested.

Arguments against:

  1. It isn't broken, don't fix it
  2. Sending the full URL in the redirect allows althttpd to serve the file directly on the next request instead of having to run the file search logic again.

Arguments for:

  1. Some would say it looks nicer
  2. Behavior would be more consistent with Apache

(18) By drh on 2026-03-09 16:31:04 in reply to 17

You make a compelling argument to reject this patch.

(9.1) By sodface on 2025-11-07 03:03:30 edited from 9.0 in reply to 1.2

Proposition: While the redirection is required, it should only serve to add the / character, and not the entire new resource path. This is in keeping with the apparent intention in the source code comments, and also with behaviour of other web servers.

I referenced Apache's mod_dir documentation about this in another thread but it applies here and reinforces your statement above:

A "trailing slash" redirect is issued when the server receives a request for a URL http://servername/foo/dirname where dirname is a directory. Directories require a trailing slash, so mod_dir issues a redirect to http://servername/foo/dirname/

https://httpd.apache.org/docs/current/mod/mod_dir.html