Lynxisms explored

People often ask why their version of Lynx has certain "problems" or "quirks". Sometimes the question isolates a bug in Lynx; sometimes the answer lies in how Lynx handles the anarchy of the Web. This page seeks to explain Lynx's behaviour when it encounters problems on the WWW and, in some sense, to justify the actions Lynx takes.

Redirecting POST content

Thanks to Klaus Weide <kweide@tezcat.com> for his help with this section.

When you submit a form, what you have filled in is often sent to a computer program on a remote server (a CGI script) that tries to process your input. Normally, after successful processing, these programs want to show you another page. There are basically two ways to do this. The normal way is for the CGI itself to send the page back to the browser/client as its output. If this is not possible, the CGI can generate a simple page with one link on it that points to the other page. The other, and problematic, way is to send what is called a 'redirect' to the browser. This is a special type of response, with a special status code and the location of the new page. Though the latter method is more convenient for page authors, it creates some problems that result from the different kinds of redirects available in HTTP 1.0.

Generally these forms send back a code 302, which stands for 'moved temporarily'. (If the server sends back a code 301 'moved permanently', Lynx treats it essentially the same way as a 302.) This code was designed to be used only for documents that have really "moved temporarily", and the browser is expected to treat the new location exactly as it would the former. In this particular case that means redirecting the POST content (the stuff you filled in on the form), that is, sending it again to the new location. This is usually not what CGI authors have in mind, but they use the redirection code 302 anyway because most browsers do not follow the rules when they receive this kind of redirect (and so the authors may never learn that they are doing something wrong). The HTTP 1.0 specifications also ask that the browser confirm this action with the user before actually sending the data again, and Lynx tries to do just that. Therefore, Lynx 2.6 will ask you:

wwW: Redirection for POST content.  Proceed (y/n)?

You should reply with a No. If you let Lynx send the POST content to the new location (as it has just been asked to do), the server will usually not understand. Unfortunately, if you reply No, there is no easy way to reach the referenced page (short of finding the URL you were redirected to and using g)oto to go there). If you reply Yes, you will probably see a 501 "not implemented" error, or some other unexpected error may occur (such as an incomplete response).

Since this incorrect use of redirection is widespread, this was a problem (in a sense punishing Lynx users for the faults of the CGI author) and it has been corrected in Lynx 2.7, which now asks:

Redirection of POST content. P)roceed, see U)RL, use G)ET, or C)ancel?

Typing a G will result in Lynx requesting that the document be sent to it (i.e., it will GET the document referenced by the URL). This is what most other browsers do automatically, because they do not implement the HTTP protocol correctly. If there is any doubt about this (almost no one uses redirection of POST content correctly, so there generally isn't any doubt), you should type a U to see the URL. If it looks like a CGI (it may have 'cgi' or 'cgi-bin' somewhere in the path), use P to redirect the POST; otherwise use G to GET the document.
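The four choices in the Lynx 2.7 prompt can be sketched roughly as follows. This is only an illustration of the behaviour described above, not Lynx's actual code: the names Request and handle_choice are made up for the example, and the assumption that U and C send nothing (with the caller asking again after U) is ours.

```python
from typing import NamedTuple, Optional


class Request(NamedTuple):
    """A follow-up request (hypothetical type, for illustration only)."""
    method: str
    url: str
    body: Optional[str]


def handle_choice(choice: str, redirect_url: str, post_body: str) -> Optional[Request]:
    """Map a key typed at the prompt to the request that would follow.

    Returns None when nothing is sent: U only displays the URL
    (assumption: the caller then asks again) and C cancels.
    """
    key = choice.upper()
    if key == "P":
        # Proceed: resend the form data to the new location, as a 302 asks.
        return Request("POST", redirect_url, post_body)
    if key == "G":
        # Use GET: fetch the new location and drop the form data.
        return Request("GET", redirect_url, None)
    if key == "U":
        print(redirect_url)
        return None
    if key == "C":
        return None
    raise ValueError("expected one of P, U, G or C")
```

For example, handle_choice("G", url, data) yields a plain GET for the new URL, which is what most other browsers do without asking.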

By now you're asking yourself why this was ever done. Can't Lynx just behave like all the other browsers and follow the 'de facto' rules? Well, we wish it were so easy. The original problem is with HTTP 1.0, which failed to define a code for CGIs whose output sends someone to another page (a directory service, for example, that uses a database of names and URLs). HTTP 1.1 corrects this: there is a new code 303 "see other", which Lynx implements. Other browsers also seem to handle this code correctly. When HTTP 1.1 is widely deployed (probably within the next few months), the problem will resolve itself as the new HTTP servers handle this situation intelligently. Meanwhile the Lynx authors have attempted to follow the specifications to the greatest extent possible, refusing to use a function for a purpose it wasn't designed to serve, since in the long run this would simply exacerbate our problems.
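The difference between the status codes discussed here can be put as a small decision function. This is a minimal sketch under the rules as this page describes them (301 and 302 keep the original method and ask before resending POST data, 303 means fetch the other page with GET). The name plan_redirect is hypothetical.

```python
from typing import Tuple


def plan_redirect(status: int, method: str) -> Tuple[str, bool]:
    """Return (method for the follow-up request, whether to ask the user first)."""
    if status == 303:
        # "See other": the result lives elsewhere, so simply GET it.
        return ("GET", False)
    if status in (301, 302):
        # The new location stands in for the old one, so the method is kept.
        # Resending POST data needs the user's confirmation.
        return (method, method == "POST")
    raise ValueError("not a redirect status covered by this sketch")
```

With this reading, a form handler that answers 302 gets the POST data again (after a prompt), while one that answers 303 gets a plain GET, which is what CGI authors usually intended.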

Bad HTML in forms

Sometimes you'll see a "** Bad HTML!! Use trace to diagnose." message while you're cruising merrily along the Web. Often this occurs with HTML forms. The message means exactly what it says: the HTML in the form is incorrect. Generally, you will not be able to use such a form, and when you try to submit whatever you may have filled in, you'll see a "** Bad HTML!! No form action defined!". If you know some HTML, try going through one of these forms (you can use \ to view the source). You'll see that the tags are badly nested or that there are unnecessary end tags (an extra </ul>, for example). The end result is that Lynx closes the form early, and when it gets to the submit button, it no longer knows that this markup was supposed to belong to the form that was closed earlier.

You might want to contact the author of the page and point out the bad markup, gently reminding them that it might "work" on their browser, but could be completely wrong, and that they should consider using an HTML validator to check that their markup is correct.
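To get a feel for the kind of nesting error described above, here is a rough sketch of a checker built on Python's standard html.parser module. It is not a validator and not how Lynx parses HTML. The names NestingChecker, check_nesting and IGNORED are invented for the example, and tags whose end tag is empty or commonly omitted are simply skipped (a simplification).

```python
from html.parser import HTMLParser

# Tags with no end tag, or whose end tag is commonly omitted (simplification).
IGNORED = {"br", "hr", "img", "input", "meta", "link",
           "li", "p", "option", "dt", "dd", "tr", "td", "th"}


class NestingChecker(HTMLParser):
    """Track open tags on a stack and note end tags that do not fit."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in IGNORED:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in IGNORED:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        elif tag in self.stack:
            self.problems.append(
                "</%s> closes while <%s> is still open" % (tag, self.stack[-1]))
            # Close everything opened inside the mismatched element.
            while self.stack.pop() != tag:
                pass
        else:
            self.problems.append("</%s> has no matching start tag" % tag)


def check_nesting(html):
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

Feeding it a form with an extra </ul> before the submit button reports the stray end tag, which is the sort of thing worth pointing out to the page's author.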

Older versions of Lynx sometimes don't have a problem handling such forms. This is because they ignore tags they know nothing about, and most often the illegal nesting occurs with "new" tags like TABLE and FONT.

Lynx's comment handling

Sometimes you'll see a blank page when you know the document has text in it. Generally this is because the author tried to put in HTML comment tags, but did them incorrectly. The correct way to write a comment in an HTML document is:

<!--the actual comment text here-->

The comment begins with the characters <!-- and ends with --> (at least the simple ones that most authors want). Some authors fail to use the correct syntax and use just a > to close the comment, for example. This is incorrect markup (historically the result of a bug in a popular version of Netscape, which was fixed in subsequent versions). If you wish to view a page with such a bad comment, simply type an apostrophe (') and Lynx will reload the page using a less stringent criterion (Historical) for closing comments, rather than the valid (minimal) comment parsing method. Lynx will continue to use the less stringent criterion for the rest of that session. You can make Historical (less stringent) comment parsing the default via a compile-time option or in your lynx.cfg. For large sites with novice users, defaulting to Historical comment parsing is recommended.
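The blank-page effect is easy to reproduce with a toy comment stripper. The sketch below is only an approximation of the two behaviours described above, not Lynx's parser: the function name strip_comments and its historical flag are hypothetical, and the fully valid parsing of -- pairs inside a comment is not modelled.

```python
def strip_comments(html: str, historical: bool = False) -> str:
    """Remove comments from html.

    historical=False: a comment ends only at "-->".
    historical=True:  a comment ends at the first ">" after "<!--".
    """
    terminator = ">" if historical else "-->"
    kept = []
    pos = 0
    while True:
        start = html.find("<!--", pos)
        if start == -1:
            kept.append(html[pos:])
            break
        kept.append(html[pos:start])
        end = html.find(terminator, start + 4)
        if end == -1:
            # No terminator found: the comment swallows the rest of the page.
            break
        pos = end + len(terminator)
    return "".join(kept)
```

With a page such as "<!-- hidden ><p>Hello</p>", the stricter mode finds no --> and returns nothing at all (the blank page), while the historical mode stops at the first > and leaves the paragraph visible.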

Cookie implementation

Lynx 2.7 contains a partial implementation of cookies. The most important thing to remember about the implementation of cookies in Lynx is that it is single-session support only. This means that any cookies you get, or any actions you take on cookies, will not last beyond the current session. Similarly, you cannot use your Lynx cookie jar with Netscape (or use your Netscape cookies with Lynx), partly because different versions of Netscape store and handle cookies in different ways.

You'll also notice that Lynx often asks you whether to accept cookies or not. It gives you four options: Yes, No, Always and neVer. If you answer Yes, Lynx will accept this particular cookie; if you answer No, Lynx will refuse this particular cookie. Picking Always means Lynx will accept all cookies from that domain within this session; pick neVer and Lynx will refuse all cookies from that domain within the session.
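The four answers can be modelled as a small per-session policy. This is a sketch of the behaviour described above rather than Lynx's implementation. The class name CookiePrompt and the ask callback are assumptions made for the example.

```python
class CookiePrompt:
    """Remember Always / neVer answers per domain for one session only."""

    def __init__(self):
        self.always = set()   # domains whose cookies are accepted unasked
        self.never = set()    # domains whose cookies are refused unasked

    def decide(self, domain, ask):
        """Return True to accept a cookie from domain.

        ask(domain) is called only when no standing answer exists and
        must return one of "Y", "N", "A" or "V" (case-insensitive).
        """
        if domain in self.always:
            return True
        if domain in self.never:
            return False
        answer = ask(domain).upper()
        if answer == "A":
            self.always.add(domain)
            return True
        if answer == "V":
            self.never.add(domain)
            return False
        return answer == "Y"
```

Because the two sets live only in memory, every standing answer is forgotten when the session ends, matching the single-session support described earlier.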

Cookies provide a means for browsers and web servers to interact with each other in ways that permit them to offer many services. A cookie is just a bunch of text that the server hands over to the browser (in this case Lynx) with a request to store it so that it can be used later to provide information about actions taken in the past. For example, cookies might be used to create a virtual shopping cart, so that as you go through a virtual store, the web server hands your browser a cookie for every object you put in your cart. When you go to the counter, the server asks Lynx to send back all the cookies, uses them to calculate what you've bought, and perhaps lets you pay for those items (NOTE: this is a hypothetical, though highly illustrative, example). There are obviously concerns about the use of cookies, which can also be used to track your movements through a site (cookies from one site are not permitted to be sent to another web site), information you may not want to give away. Future versions of Lynx will provide better and fuller support for cookies as the cookie protocol itself evolves.
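The shopping-cart example can be sketched with an in-memory cookie jar. This is a toy model under stated assumptions, not Lynx's code: SessionCookieJar and its methods are hypothetical names, domains are compared by exact match only (real cookie matching is more involved), and the host names are placeholders.

```python
from collections import defaultdict


class SessionCookieJar:
    """Hold cookies in memory for one session, keyed by the issuing domain."""

    def __init__(self):
        self._by_domain = defaultdict(dict)

    def store(self, domain, name, value):
        """Keep a cookie the server at domain asked us to store."""
        self._by_domain[domain][name] = value

    def cookie_header(self, domain):
        """Build the text sent back to domain; other sites' cookies stay put."""
        cookies = self._by_domain.get(domain, {})
        return "; ".join("%s=%s" % (n, v) for n, v in cookies.items())


jar = SessionCookieJar()
jar.store("shop.example", "item1", "book")   # first object put in the cart
jar.store("shop.example", "item2", "pen")    # second object
print(jar.cookie_header("shop.example"))     # what the counter gets to see
print(jar.cookie_header("other.example"))    # a different site gets nothing
```

Nothing is written to disk, so the cart vanishes with the session, as it does with the single-session support in Lynx 2.7.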


Lynx links | *Subir Grewal | why-does-lynx@trill-home.com