Document Link Database

<%- if @topic_view.topic.tags.present? %>
<%= t 'js.tagging.tags' %>: <%- @topic_view.topic.tags.each do |t| %> <%= t %> <%- end %>
<% end %>

A customer asked me to come up with a “document link database”.

They have several thousand documents distributed across various nationally based sites and the idea is that each document will have a SKU and no matter what happens, the href links will always be correct. Or anyway, as correct as humanly possible.

So I came up with a table that contains the SKU, the link and where it should take you. This is still SS3, btw.

class RelocatorLink extends DataObject{
private static $db = array(
“DocumentSKU” => “Varchar(200)”,
“Link” => “Varchar(200)”,
“RelocatesTo” => “Varchar(200)”
);

private static $summary_fields = array(
    "DocumentSKU", "Link", "RelocatesTo"
);

}

the pages will have a script that goes like

jQuery.noConflict();
(function($) {

$("a").click(function(e){
    e.preventDefault();

    const $href = $(this).attr("href");

    var params = "link=" + $(this).attr("href");
    var xhttp = new XMLHttpRequest();

    xhttp.onreadystatechange = function() {
        if (this.readyState == 4 && this.status == 200) {
            if(this.responseText != "") {
                window.location.href = this.responseText;
            }
            else{
                window.location.href= $href;
            }
        }

    };
    xhttp.open("POST", home/findAndReturnLink" + "?" + params, true);
    xhttp.send();
});

}(jQuery));

and a function that goes like

function findAndReturnLink(){
    $link = $_REQUEST["link"];

    $res = RelocatorLink::get()->filter(array("Link" => $link))->first();
    if($res) {
        return $res->RelocatesTo;
    }
    else{
        return false;
    }

}

So basically, you click on a link, the script calls a function that searches the table, if it finds the record, it returns the redirect, otherwise the link works as normal.

File security is handled by the remote site so the visitor may be taken to some national site where they may have to log in to access the documents.

I know there are things to be added, such as making the sure the SKU is unique and some sort of message in case Javascript is blocked on the visitor’s browser (the site runs Bootstrap so it should have Javascript enabled and JQuery loaded.

But I can’t help but feel uneasy. As a solution, it seems so simple that I am almost certain it must be risky or dangerous in some way.

I’m not asking Silverstripe to certify this or do my work for me but is there something obviously wrong with this solution?

Thanks for reading this far.

You’ll likely have issues with special characters in query strings, being escaped using when Silverstripe tries to sanitise your input. I don’t know the specifics, but roughly:

Say a document link is:
http://www.something.com/x/y/z/?document='123'

When this happens:
$res = RelocatorLink::get()->filter(array("Link" => $link))->first();

$link will get escaped and you’ll end up searching for something like:
http\/\/www.something.com\/x\/y\/z\/\?document=\'123\'

You probably need to create an index for each item based on the $link.
This might help: PHP: urlencode - Manual

Also see:

Hmmm… hadn’t thought of that. Ta. :slight_smile:

I don’t think this is accurate. SilverStripe automatically escapes things some times, including raw values to xml in templates, and escaping will be applied as necessary to produce valid SQL. But if the value of $link is the string http://www.something.com/x/y/z/?document='123', that value is what will be searched for in the database in this case.

tbh, I haven’t had a go at it yet, but it would be easy enough to feed the link in James’ answer to the function and see what SS does with it.

None of the customer’s links are as complicated as that but it’s good to be prepared.

I’ll be interested to know what _Vince’s findings are. I’ve always assumed that once a string is escaped, for the purposes of generating valid SQL, the resulting value passed to Mysql would be the escaped string.

e.g 'Some"thing' would end up in the SQL query as 'Some\"thing'

Also, given the number of special characters used in URL’s, there’ll surely be issues. It would make sense to at least encode any stored and incoming URL’s and use them as indexes. I think.

EDIT: I realise you’re saying that SilverStripe may not necessarilly escape characters in the string, which is the crux of the question.

Just to play devil’s advocate for a second… is this the best solution?

You’re essentially just dealing with a bunch of redirects for some thousands of files. At the moment you’re talking about bootstrapping an entire PHP framework and running database queries for each and every file request that comes in.

I think I would maybe investigate ways to do this more statically. You could still use the CMS as an editing system, but publishing the redirects statically when there’s a change would make a lot more sense from a performance and resilience point to view.

Apache is pretty quick at dealing with redirects (although a massive htaccess file might not be very nice). Nginx can be configured for redirects. You could even set up a javascript worker (such as Cloudflare provides) with a simple, fast K-V backend to handle it.

Just a thought :slight_smile:

I had similar thoughts and CloudFlare workers jumped to mind for me as well.

Also not sure if it’s possible for your requirements @_Vince but I would lose the javascript component and essentially create your own URL shortener service (like Bitly). So you could setup a subdomain that points to CloudFlare Workers or similar, with links like https://documents.mydomain.com/[DocumentSKU]

You can still have the same logic, but you could use those links anywhere, without javascript.

Plus, with this approach you could set the whole thing up fairly quickly in SilverStripe alone, then if performance turns out to be a problem you could keep the CMS for administration but implement the static redirect or worker idea for the actual redirects.

Thanks people! I didn’t ghost, I just got sidetracked on other stuff while the original client was on holidays.

Some great ideas here, thank you all very much.

Thanks for your good guidance