Can’t remove http:// from string (unserialized key value)

Remove https:// or http:// from a $string using PHP

I needed to remove the http:// part of a website address in order to display it correctly on a web page.

What I wanted was:

http://www.mysite.co.uk to become just www.mysite.co.uk
https://www.this-site.com to become just www.this-site.com

 

Although I’m still learning PHP as I go, I thought this would be a small task. It turned out to be a lot harder than I expected, because the website address string was coming from a serialized array.

This task would normally be easy to perform using the PHP str_replace() function, as follows:

$string = str_replace(array('http://', 'https://'), '', $string);

Alternative methods could also be the preg_replace() or parse_url() functions.
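For comparison, here is a minimal sketch of those two alternatives on an ordinary, non-encoded address. The variable names $viaRegex and $viaParse are only illustrative, and the example assumes the string really does contain a literal http:// or https://.

```php
<?php
// Sketch: two other ways to drop the scheme from a plain URL string.
$string = 'https://www.this-site.com';

// 1. preg_replace(): strip a leading http:// or https:// (case-insensitive).
$viaRegex = preg_replace('#^https?://#i', '', $string);

// 2. parse_url(): ask only for the host part and ignore the scheme.
$viaParse = parse_url($string, PHP_URL_HOST);

echo $viaRegex, "\n"; // www.this-site.com
echo $viaParse, "\n"; // www.this-site.com
```

Note that parse_url() with PHP_URL_HOST returns only the host, so any path after the domain would be dropped, while the preg_replace() version keeps it.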

However, whichever method I used, I still could not remove the http:// or https:// from the website URL. After researching many methods in online forums and help sites, I still had not found an easy answer to this problem. But I did come up with my workaround 🙂

As mentioned, the string is retrieved from serialized data in a MySQL database. The data is unserialized into keys and values, and the value is then placed in a variable ($website). As a test, I ran the following to check that str_replace() was working normally:

// $website from unserialized data - $website=$value; (https://www.this-site.com)

$url1 = $website;
$url2 = "https://www.mysite.co.uk";

$url1 = str_replace(array('http://', 'https://'), '', $url1);
$url2 = str_replace(array('http://', 'https://'), '', $url2);

echo $url1; // still displays https://www.this-site.com
echo $url2; // www.mysite.co.uk

As you can see, $url2 works just fine, but the https:// is not removed from $url1 🙁

var_dump did help to see a little more

One of the suggestions I received was to try the var_dump() function. This revealed that the string was longer than I thought it should be.

As shown below, var_dump() reports the length as 35, even though https://www.this-site.com is only 25 characters long:

var_dump($website);

string(35) "https://www.this-site.com"

This then led me to another function, htmlspecialchars(), and all was revealed. My str_replace() was not looking for the correct characters: https:// did not exist in my $string. I don’t know why, or how to convert it back, but each / (forward slash) was stored as its HTML entity ‘&#047;’, which the browser displays as an ordinary slash.

echo(htmlspecialchars($website));

https:&#047;&#047;www.this-site.com
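The length mismatch can be reproduced with a short sketch. It assumes the stored value had both slashes saved as the numeric entity &#047; (the entity this post identifies as the culprit); the literal value assigned to $website here is my reconstruction, not a copy of the real database row.

```php
<?php
// Assumed stored value: each "/" saved as the six-character entity &#047;
$website = 'https:&#047;&#047;www.this-site.com';

echo strlen($website), "\n";                    // 35
echo strlen('https://www.this-site.com'), "\n"; // 25

// str_replace() finds nothing, because "https://" never appears literally.
$unchanged = str_replace(array('http://', 'https://'), '', $website);
var_dump($unchanged === $website); // bool(true)
```

Each entity is six characters standing in for one, so two of them account for the ten extra characters that var_dump() reported.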

 

Finally – removing the http:// from website address

After trying various ways of removing text from a $string, I finally settled on this method:

function cleanurl($string) {
    // remove the scheme, then the encoded slashes (&#047;)
    $string = str_replace(array('https:', 'http:', '&#047;'), '', $string);
    return strtolower($string); // make lower case
}

echo "<br>".cleanurl($website);

// now shows as www.this-site.com

What I have done is create a function, ‘cleanurl’, to remove the unwanted text from my URL. It removes https: if it is present, then http:, and finally takes out the HTML entity ‘&#047;’ that caused all the problems. As you can see, I’ve also used strtolower() to make the result lower case.
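Because cleanurl() strips every slash, an address with a path would lose its separators. One possible alternative, offered only as a sketch, is to decode the entities first with html_entity_decode() and then remove just the leading scheme. The function name strip_scheme() is hypothetical, and the sketch assumes the only problem with the stored value is the entity encoding.

```php
<?php
// Hypothetical helper: decode HTML entities, then strip a leading scheme.
function strip_scheme($url) {
    // Turn &#047; (and any other entities) back into literal characters.
    $decoded = html_entity_decode($url, ENT_QUOTES, 'UTF-8');
    // Remove only a leading http:// or https://; slashes in a path are kept.
    return preg_replace('#^https?://#i', '', trim($decoded));
}

echo strip_scheme('https:&#047;&#047;www.this-site.com'), "\n"; // www.this-site.com
echo strip_scheme('http://www.mysite.co.uk/about/'), "\n";      // www.mysite.co.uk/about/
```

This version leaves the case of the address alone; wrapping the result in strtolower() would match the behaviour of cleanurl().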

As mentioned, I did try to get a more experienced answer to this problem. However, as a PHP newbie I felt a little out of my depth, with some of the replies indicating that the code should already be sufficient and working correctly. That was obviously not the case, so after trying to explain I have had to settle for a workaround. Hopefully I may get some comments from the more experienced to enlighten me and provide a more efficient way, but for now it does the job!

 
