
Identifying Google Analytics Referrer Spam Using R
Before beginning any analysis of Google Analytics data, it is important to clean up the referral lists so that you analyze only actual visits to your web site. Referral spam is a problem that began in about 2014, when webmasters began to notice referrals in Google Analytics that did not appear in web server access logs: no one had actually visited the site. Referral spam operators randomly guess Google Analytics tracking ID codes and impersonate accesses to a site in the hope that a webmaster reviewing the referral list will visit the spammer's site and download malicious code or purchase a product or service of interest to webmasters. A look at Google Trends shows that this became a major problem in 2015, as shown in Figure 1.
Figure 1. Google Trends shows a dramatic uptick in searches for “referral spam” beginning in 2015.
Within Google Analytics, you can set a flag in the admin area for a view to filter out well-known referral spammers using filters described in an article by Ben Travis at Viget. For analysis in R, you will not be able to use the view filters in Google Analytics and will have to filter out the referral spammers yourself before you do any analysis. The RGoogleAnalytics and rdomains R packages offer a convenient, programmatic way to analyze Google Analytics referral spam attacks and remove them from other Google Analytics analysis. The article is divided into the following sections:
- Retrieve Referrer Data from Google Analytics
- Use the urltools Package to Isolate the Domain Name
- Use the rdomains Package to Look up Domains on Dmoz and Shallalist, Two Domain Classification Sites
- Look up Referrer Domain Names on Virustotal, a Domain Classification Site
- Filter the Referrer List to Identify Referral Spammers
Retrieve Referrer Data from Google Analytics
The first step is to use the RGoogleAnalytics package to retrieve the referral data from Google Analytics. In the example shown, the OAuth token has been generated and saved once for use in all scripts. The query parameters are fairly general and include the ga:fullReferrer parameter, as shown in Figure 2. Make sure to set up logic to save the query data and check for existence before running the query, as the query can take a while; you may also run up against daily retrieval limits.
#
# Retrieve the previously saved OAuth token
#
require(RGoogleAnalytics)
load("~/Consulting_Business/R/working/token_file")
ValidateToken(token)
## Access Token successfully updated
profiles <- GetProfiles(token)
## Access Token is valid
#
# Build a list of all the Query Parameters
#
if (!file.exists("./ga_data_referrer")) {
  query.list <- Init(start.date = "2015-01-12",
                     end.date = "2016-09-30",
                     dimensions = "ga:date,ga:hour,ga:pagePath,ga:sourceMedium,ga:fullReferrer,ga:metro,ga:networkDomain",
                     metrics = "ga:sessions,ga:pageviews,ga:sessionDuration,ga:bounceRate",
                     max.results = 10000,
                     sort = "ga:date,ga:hour",
                     filters = "ga:medium==referral",
                     table.id = paste("ga:",gaProfileID,sep=""))
  ga.query <- QueryBuilder(query.list)
  # Extract the data and store it in a data frame
  ga.data <- GetReportData(ga.query, token)
  save(ga.data,file="./ga_data_referrer")
} else {
  load("./ga_data_referrer")
}
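The save-and-check logic recommended above recurs at every step of this article, so it can be factored into a small helper. The following is only a sketch in base R: `load_or_compute` and `slow_query` are hypothetical names, not part of RGoogleAnalytics, and the sketch uses `saveRDS`/`readRDS` where the article itself uses `save`/`load`.

```r
# Hypothetical helper: run `compute` only when no cached copy exists at
# `path`; otherwise read the cached copy back from disk.
load_or_compute <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  saveRDS(result, path)
  result
}

# Stand-in for the slow Google Analytics query; it counts how often it runs.
cache_file <- tempfile(fileext = ".rds")
calls <- 0
slow_query <- function() {
  calls <<- calls + 1
  data.frame(fullReferrer = c("100dollars-seo.com/try.php", "www.google.com/"),
             sessions = c(5, 12),
             stringsAsFactors = FALSE)
}

first  <- load_or_compute(cache_file, slow_query)  # runs the query
second <- load_or_compute(cache_file, slow_query)  # reads the cached file
```

Deleting the cache file forces the query to run again, which is the same effect as deleting `./ga_data_referrer` in the code above.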
Use the urltools Package to Isolate the Domain Name
The next step in the process is to use the tldextract function (from the tldextract package), together with the domain function in the urltools package, to isolate the domain name in the referral string, and then filter for referrals from the root directory of a domain or from try.php, two common paths found in referral spam. In addition, any occurrence of darodar is included in the filter, as this was one of the original referral spam domains. The code to isolate and filter domains is shown in Figure 3.
require(dplyr)
require(urltools)
require(tldextract)
#
# dplyr does not currently like $domain notation...thus [["domain"]]
#
gaDf <- ga.data
gaDf <- gaDf %>%
  mutate(referrerDomain=paste(tldextract(urltools::domain(gaDf$fullReferrer))[["domain"]],
                              ".",
                              tldextract(urltools::domain(gaDf$fullReferrer))[["tld"]],
                              sep=""))
#gaRefSpamDf <- gaDf %>% filter(pagePath == "/" |
#                               grepl("try.php",fullReferrer) |
#                               grepl("darodar",fullReferrer)) %>%
#  group_by(referrerDomain) %>%
#  summarize(attacks=n())
gaRefSpamDf <- gaDf %>%
  group_by(referrerDomain) %>%
  summarize(attacks=n())
gaRefSpamDetailDf <- gaDf %>%
  group_by(referrerDomain,pagePath,fullReferrer) %>%
  summarize(attacks=n())
gaRefSpamDf
## # A tibble: 123 × 2
##    referrerDomain     attacks
##    <chr>                <int>
##  1 100dollars-seo.com       5
##  2 1und1.de                 1
##  3 aafes.com                2
##  4 alhea.com                3
##  5 alot.com                 1
##  6 aol.com                  1
##  7 asana.com                1
##  8 ask.com                 15
##  9 atlassian.net            2
## 10 b1.org                   1
## # ... with 113 more rows
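To make the two ideas in this section concrete, here is a minimal base-R sketch on a few illustrative rows. It is not the method used above: `naive_domain` is a hypothetical helper that simply keeps the last two host labels, and the `suspect` flag reproduces the heuristic described in the text (root page, try.php, or darodar).

```r
# Illustrative rows in the shape returned by the Google Analytics query
refs <- data.frame(
  pagePath = c("/", "/", "/about", "/Web-Commerce/some-article"),
  fullReferrer = c("100dollars-seo.com/try.php",
                   "forum.topic44008047.darodar.com/",
                   "www2.delta-search.com/",
                   "www.google.co.id/"),
  stringsAsFactors = FALSE
)

# Naive registered-domain guess: drop the path, keep the last two host labels.
# This ignores multi-part public suffixes such as co.id.
naive_domain <- function(referrer) {
  host <- sub("/.*$", "", referrer)
  labels <- strsplit(host, ".", fixed = TRUE)
  vapply(labels, function(x) paste(tail(x, 2), collapse = "."), character(1))
}

refs$referrerDomain <- naive_domain(refs$fullReferrer)

# Heuristic flag: hit on the root page, a try.php referrer, or darodar
refs$suspect <- refs$pagePath == "/" |
  grepl("try.php", refs$fullReferrer, fixed = TRUE) |
  grepl("darodar", refs$fullReferrer, fixed = TRUE)
```

The last row comes out as `co.id` rather than `google.co.id`, which is the reason for relying on a suffix-aware function such as tldextract instead of splitting on dots.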
Use the rdomains Package to Look up Domains on Dmoz and Shallalist, Two Domain Classification Sites
The next step in the process is to use the rdomains package to look up the domains on various domain classification sites, beginning with dmoz and shallalist. The get_dmoz_data and get_shalla_data functions each download a data set that the dmoz_cat and shalla_cat calls then use; both calls take a vector of domain names and return data frames with classification information, as shown in the example given in Figure 4. For purposes of identifying referrer spam domains, we will look only at domains that neither dmoz nor shallalist classifies.
#
# Retrieve dmoz and shallalist catalogs
#
require(rdomains)
if (!file.exists("./dmoz_domain_category.csv")) {
  get_dmoz_data(outdir = "./", overwrite = FALSE)
}
if (!file.exists("./shalla_domain_cateory.csv") && !file.exists("./shalla_domain_category.csv")) {
  get_shalla_data(outdir = "./", overwrite = FALSE)
}
#
# Query the dmoz catalog
#
if (!file.exists("./gaRefSpam1Df")) {
  gaRefSpam1Df <- gaRefSpamDf %>%
    mutate(dmozCat = dmoz_cat(gaRefSpamDf$referrerDomain,
                              use_file = "dmoz_domain_category.csv")$dmoz_category) %>%
    filter(is.na(dmozCat))
  save(gaRefSpam1Df, file="./gaRefSpam1Df")
} else {
  load("./gaRefSpam1Df")
}
#
# Query the shallalist catalog
#
if (!file.exists("./gaRefSpam2Df")) {
  gaRefSpam2Df <- gaRefSpam1Df %>%
    mutate(shallaCat = shalla_cat(domains = referrerDomain)$shalla_category) %>%
    filter(is.na(shallaCat))
  save(gaRefSpam2Df, file="./gaRefSpam2Df")
} else {
  load("./gaRefSpam2Df")
}
summary(gaRefSpam2Df)
##  referrerDomain        attacks          dmozCat           shallaCat
##  Length:69          Min.   :   1.00   Length:69          Length:69
##  Class :character   1st Qu.:   1.00   Class :character   Class :character
##  Mode  :character   Median :   1.00   Mode  :character   Mode  :character
##                     Mean   :  29.61
##                     3rd Qu.:   5.00
##                     Max.   :1720.00
gaRefSpam2Df
## # A tibble: 69 × 4
##    referrerDomain               attacks dmozCat shallaCat
##    <chr>                          <int> <chr>   <chr>
##  1 100dollars-seo.com                 5 <NA>    <NA>
##  2 alot.com                           1 <NA>    <NA>
##  3 asana.com                          1 <NA>    <NA>
##  4 atlassian.net                      2 <NA>    <NA>
##  5 basecamphq.com                     1 <NA>    <NA>
##  6 best-seo-offer.com                10 <NA>    <NA>
##  7 best-seo-solution.com              7 <NA>    <NA>
##  8 binarystream.com                   1 <NA>    <NA>
##  9 buttons-for-website.com            1 <NA>    <NA>
## 10 buttons-for-your-website.com       6 <NA>    <NA>
## # ... with 59 more rows
This list of domains still includes several that are clearly legitimate by inspection.
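The filtering logic in this step, keeping only the domains that neither catalog classifies, can be sketched in base R without downloading anything. The two catalogs below are toy stand-ins with made-up categories, not real dmoz or shallalist contents, and `match()` replaces the rdomains lookup calls for the purpose of illustration.

```r
# Toy stand-ins for the two catalogs (hypothetical contents)
dmoz_catalog <- data.frame(domain = c("aol.com", "ask.com"),
                           dmoz_category = c("Computers", "Computers"),
                           stringsAsFactors = FALSE)
shalla_catalog <- data.frame(domain = "1und1.de",
                             shalla_category = "hosting",
                             stringsAsFactors = FALSE)

referrers <- data.frame(
  referrerDomain = c("100dollars-seo.com", "aol.com", "1und1.de", "darodar.com"),
  attacks = c(5L, 1L, 1L, 2L),
  stringsAsFactors = FALSE
)

# match() returns NA for domains a catalog does not list, so the looked-up
# category is NA exactly when the domain is unclassified
referrers$dmozCat <- dmoz_catalog$dmoz_category[
  match(referrers$referrerDomain, dmoz_catalog$domain)]
referrers$shallaCat <- shalla_catalog$shalla_category[
  match(referrers$referrerDomain, shalla_catalog$domain)]

# Keep only the domains that neither catalog classifies
unclassified <- referrers[is.na(referrers$dmozCat) & is.na(referrers$shallaCat), ]
```

An unclassified domain is only a spam candidate, not a confirmed spammer, which is why legitimate sites still show up in the list at this stage.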
Look up Referrer Domain Names on Virustotal, a Domain Classification Site
As shown in the output of the code in Figure 4, we still have a few domains that are known to be legitimate. To filter these out, we will go to the Virustotal service for further classification. Virustotal works somewhat differently from the other services: you must have an account and an API key, both of which are free. Calls to Virustotal are limited to four per minute, so the rdomains interface works a little differently; you cannot send a vector of domain names to the virustotal_cat call. To process a group of domains, you will need to write a function similar to the one shown in Figure 5.
#
# Write a function to query Virustotal for a vector of domain
# names and limit the query rate to four per minute
#
getVirustotal <- function(domainDf,VirustotalApiKey) {
  require(rdomains)
  require(dplyr)
  # Empty data frame with the columns returned for each domain
  virusDomain <- data.frame(domain=character(),
                            bitdefender=character(),
                            dr_web=character(),
                            alexa=character(),
                            google=character(),
                            websense=character(),
                            trendmicro=character())
  for (i in seq_along(domainDf)) {
    # Pause 15 seconds between calls to stay within four queries per minute
    Sys.sleep(15)
    thisDomain <- virustotal_cat(domainDf[i],apikey = VirustotalApiKey)
    if (!is.null(thisDomain)) {
      virusDomain <- merge(virusDomain,thisDomain,all=TRUE)
    }
  }
  return(virusDomain)
}
#
# Call the function to get the Virustotal info for all domains
#
if (!file.exists("./gaRefSpam3Df")) {
gaRefSpam3Df <- getVirustotal(gaRefSpamDf$referrerDomain,VirustotalApiKey)
save(gaRefSpam3Df,file="./gaRefSpam3Df")
} else {
load("./gaRefSpam3Df")
}
gaRefSpam3Df
## domain bitdefender dr_web alexa google websense trendmicro
## 1 100dollars-seo.com <NA> <NA> <NA> uncategorized uncategorized <NA>
## 2 1und1.de hosting <NA> anbieter hosting information technology computers internet
## 3 aafes.com onlineshop <NA> military onlineshop shopping <NA>
## 4 alhea.com searchengines <NA> <NA> searchengines search engines and portals <NA>
## 5 alot.com business <NA> toolbars business society and lifestyles <NA>
## 6 aol.com computersandsoftware <NA> web_portals computersandsoftware search engines and portals search engines portals,news media
## 7 asana.com business <NA> <NA> business hosted business applications <NA>
## 8 ask.com searchengines <NA> ask searchengines search engines and portals search engines portals
## 9 atlassian.net marketing <NA> <NA> marketing educational materials <NA>
## 10 b1.org computersandsoftware not recommended site <NA> computersandsoftware information technology <NA>
## 11 basecamp.com business <NA> hosted business web collaboration computers internet
## 12 basecamphq.com business <NA> <NA> business web collaboration <NA>
## 13 best-seo-offer.com <NA> <NA> <NA> elevated exposure elevated exposure <NA>
## 14 best-seo-solution.com parked <NA> <NA> parked uncategorized <NA>
## 15 binarystream.com <NA> not recommended site <NA> information technology information technology <NA>
## 16 bing.com searchengines <NA> bing searchengines search engines and portals search engines portals
## 17 bt.com business <NA> carriers business business and economy business economy
## 18 buttons-for-website.com <NA> <NA> <NA> suspicious embedded link suspicious embedded link <NA>
## 19 buttons-for-your-website.com <NA> <NA> <NA> uncategorized uncategorized <NA>
## 20 centurylink.com business <NA> united_states business business and economy computers internet
## 21 centurylink.net portals <NA> <NA> portals news and media search engines portals
## 22 charter.net business <NA> business_and_economy business news and media news media
## 23 cincinnatibell.net business <NA> public_utilities business search engines and portals <NA>
## 24 clearch.org business <NA> <NA> business business and economy <NA>
## 25 cognizant.com business <NA> c business information technology <NA>
## 26 comcast.net news <NA> <NA> news business and economy news media
## 27 cox.com onlineshop <NA> operators onlineshop business and economy business economy
## 28 crazyguyonabike.com sports <NA> travelogues sports society and lifestyles <NA>
## 29 darodar.com parked <NA> <NA> parked suspicious content <NA>
## 30 delta-search.com searchengines not recommended site <NA> searchengines business and economy <NA>
## 31 desk.com business <NA> saas business hosted business applications blogs web communications
## 32 diigo.com computersandsoftware social networks <NA> computersandsoftware personal network storage and backup <NA>
## 33 disconnect.me education <NA> <NA> education proxy avoidance unknown
## 34 disqus.com computersandsoftware <NA> <NA> computersandsoftware information technology blogs web communications,newsgroups
## 35 dnsrsearch.com <NA> <NA> <NA> search engines and portals search engines and portals <NA>
## 36 dogpile.com searchengines not recommended site/adult content metasearch searchengines search engines and portals <NA>
## 37 duckduckgo.com searchengines <NA> search_engines searchengines search engines and portals search engines portals
## 38 earthlink.net bank e-mail united_states bank information technology <NA>
## 39 ecosia.org searchengines <NA> <NA> searchengines search engines and portals search engines portals
## 40 emailsrvr.com webmail <NA> <NA> webmail web hosting email
## 41 evernote.com computersandsoftware <NA> software computersandsoftware personal network storage and backup computers internet,personal network storage
## 42 facebook.com socialnetworks social networks we_are_the_99_percent socialnetworks social web - facebook social networking
## 43 financemarketing.com computersandsoftware <NA> <NA> computersandsoftware financial data and services <NA>
## 44 findeer.com searchengines <NA> <NA> searchengines search engines and portals <NA>
## 45 godaddy.com marketing <NA> g marketing web hosting web hosting
## 46 google.by searchengines <NA> <NA> searchengines search engines and portals search engines portals
## 47 google.ca searchengines <NA> <NA> searchengines search engines and portals search engines portals
## 48 google.co.id searchengines <NA> <NA> searchengines search engines and portals search engines portals
## 49 google.co.jp searchengines <NA> ガイドとディレクトリ searchengines search engines and portals search engines portals,reference
## 50 google.com searchengines chats google searchengines search engines and portals search engines portals
## 51 google.com.au searchengines <NA> search_engines searchengines search engines and portals search engines portals
## 52 google.com.kw searchengines <NA> <NA> searchengines search engines and portals <NA>
## 53 google.cz searchengines <NA> google searchengines search engines and portals search engines portals
## 54 google.de searchengines <NA> google searchengines search engines and portals <NA>
## 55 google.fr computersandsoftware <NA> google computersandsoftware search engines and portals search engines portals
## 56 google.it searchengines <NA> motori searchengines search engines and portals search engines portals
## 57 google.nl searchengines <NA> google searchengines search engines and portals search engines portals
## 58 hootsuite.com socialnetworks <NA> twitter socialnetworks social networking social networking
## 59 hud.gov business <NA> home business government government legal
## 60 info.com searchengines <NA> metasearch searchengines search engines and portals search engines portals
## 61 informationvine.com <NA> <NA> <NA> search engines and portals search engines and portals <NA>
## 62 isket.jp blogs <NA> <NA> blogs information technology <NA>
## 63 ixenia.com <NA> <NA> <NA> uncategorized uncategorized <NA>
## 64 ixquick.com searchengines <NA> metasearch searchengines search engines and portals search engines portals
## 65 ixquick.de business <NA> <NA> business search engines and portals search engines portals
## 66 justprofit.xyz business <NA> <NA> business elevated exposure <NA>
## 67 k9safesearch.com education <NA> <NA> education business and economy <NA>
## 68 larger.io <NA> <NA> <NA> business and economy business and economy <NA>
## 69 linkedin.com socialnetworks social networks social_networking socialnetworks social web - linkedin social networking,business economy
## 70 live.com webmail <NA> internet webmail search engines and portals search engines portals,email
## 71 locatimefree.com business <NA> <NA> business business and economy <NA>
## 72 meetup.com socialnetworks adult content/social networks social_networking socialnetworks social networking social networking,business economy
## 73 microsofttranslator.com education <NA> traductors_automàtics education reference materials translators cached pages
## 74 moz.com computersandsoftware <NA> <NA> computersandsoftware information technology computers internet
## 75 NA.NA <NA> <NA> <NA> <NA> <NA> <NA>
## 76 nextdoor.com socialnetworks social networks <NA> socialnetworks social networking <NA>
## 77 obrazky.cz marketing <NA> služby marketing search engines and portals <NA>
## 78 office365.com computersandsoftware <NA> <NA> computersandsoftware collaboration - office computers internet
## 79 office.com computersandsoftware <NA> groupware computersandsoftware collaboration - office business economy
## 80 pch.com gambling <NA> contests_and_sweepstakes gambling games <NA>
## 81 peoplepc.com computersandsoftware <NA> united_states computersandsoftware information technology <NA>
## 82 pushbullet.com marketing <NA> <NA> marketing information technology disease vector,spam
## 83 qwant.com computersandsoftware <NA> moteurs_de_recherche computersandsoftware search engines and portals <NA>
## 84 rankings-analytics.com <NA> <NA> <NA> suspicious content suspicious content <NA>
## 85 rankscanner.com blogs <NA> <NA> blogs information technology <NA>
## 86 richpasco.org <NA> <NA> <NA> uncategorized uncategorized <NA>
## 87 rof.net business <NA> <NA> business information technology <NA>
## 88 salesforce.com computersandsoftware <NA> contact_management computersandsoftware hosted business applications business economy
## 89 saltpalace.com business <NA> <NA> business business and economy <NA>
## 90 searchlock.com business <NA> <NA> business proxy avoidance <NA>
## 91 securesearch.co business <NA> <NA> business entertainment <NA>
## 92 semaltmedia.com business <NA> <NA> business uncategorized <NA>
## 93 servicepunt71.nl <NA> <NA> <NA> business and economy business and economy <NA>
## 94 seznam.cz searchengines <NA> portály searchengines search engines and portals search engines portals
## 95 shawcable.net business <NA> <NA> business business and economy <NA>
## 96 smarter.com onlineshop <NA> <NA> onlineshop shopping <NA>
## 97 social-buttons.com <NA> not recommended site <NA> uncategorized uncategorized <NA>
## 98 sosodesktop.com <NA> <NA> <NA> information technology information technology <NA>
## 99 stackoverflow.com computersandsoftware <NA> chats_and_forums computersandsoftware information technology computers internet
## 100 startjuno.com business <NA> <NA> business news and media <NA>
## 101 startnetzero.net business <NA> <NA> business news and media <NA>
## 102 startpage.com searchengines <NA> <NA> searchengines search engines and portals search engines portals
## 103 startssl.com computersandsoftware <NA> <NA> computersandsoftware business and economy internet infrastructure
## 104 success-seo.com business <NA> <NA> business uncategorized <NA>
## 105 suddenlink.net portals <NA> <NA> portals search engines and portals <NA>
## 106 t.co computersandsoftware not recommended site <NA> computersandsoftware information technology social networking
## 107 tds.net education <NA> <NA> education information technology news media
## 108 telstra.com.au business <NA> carriers business business and economy business economy
## 109 thegeekspeaks.net <NA> <NA> <NA> uncategorized uncategorized <NA>
## 110 toshiba.com business <NA> <NA> business business and economy <NA>
## 111 twcc.com <NA> <NA> <NA> entertainment entertainment <NA>
## 112 video--production.com <NA> <NA> <NA> uncategorized uncategorized <NA>
## 113 videos-for-your-business.com <NA> <NA> <NA> uncategorized uncategorized <NA>
## 114 webcrawler.com searchengines <NA> metasearch searchengines search engines and portals <NA>
## 115 web.de portals <NA> startseiten_und_portale portals search engines and portals search engines portals
## 116 webmastercentre.co.uk business <NA> <NA> business information technology <NA>
## 117 windstream.net business <NA> <NA> business search engines and portals search engines portals
## 118 wow.com games <NA> <NA> games search engines and portals <NA>
## 119 wowway.net business <NA> <NA> business information technology computers internet
## 120 xfinity.com business <NA> <NA> business business and economy <NA>
## 121 yahoo.com news <NA> web_portals news search engines and portals search engines portals
## 122 ygask.com <NA> <NA> <NA> uncategorized uncategorized <NA>
## 123 zendesk.com computersandsoftware <NA> saas computersandsoftware hosted business applications business economy
We now have a list where we can clearly identify the referral spammers using only the domain classification services.
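The rate-limited loop used for the Virustotal lookups can be written generically. The sketch below is an assumption-laden variant rather than the function used above: `rate_limited_lookup`, `lookup_fn`, `fake_lookup`, and the single `category` column are all hypothetical, and the stand-in lookup never touches the network. It collects one row per domain in a list and records a row of NA when a call fails, so one bad response does not lose the whole run.

```r
# Query `lookup_fn` once per domain, pausing `delay` seconds between calls.
# `lookup_fn` is a placeholder for a real one-domain-at-a-time API call.
rate_limited_lookup <- function(domains, lookup_fn, delay = 15) {
  results <- vector("list", length(domains))
  for (i in seq_along(domains)) {
    if (i > 1) Sys.sleep(delay)
    results[[i]] <- tryCatch(
      lookup_fn(domains[i]),
      # On failure, keep the domain with an NA category instead of stopping
      error = function(e) data.frame(domain = domains[i],
                                     category = NA_character_,
                                     stringsAsFactors = FALSE)
    )
  }
  do.call(rbind, results)
}

# Stand-in lookup that fails for one domain, to show the fallback row
fake_lookup <- function(domain) {
  if (domain == "b1.org") stop("simulated API failure")
  data.frame(domain = domain, category = "uncategorized",
             stringsAsFactors = FALSE)
}

# delay = 0 only so the example runs instantly; four calls per minute
# corresponds to the default of 15 seconds
out <- rate_limited_lookup(c("darodar.com", "b1.org", "ygask.com"),
                           fake_lookup, delay = 0)
```

With 123 domains and a 15-second pause, a full pass takes roughly half an hour, which is another reason to save the result to disk as the article does.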
Filter the Referrer List to Identify Referral Spammers
To filter down to the final list of referral spammers, we will use dplyr to include only domains that Dr. Web either does not classify or classifies as "not recommended site" or "known infection source."
#
# Filter on characteristics of known referral spam domains
#
if (!file.exists("./gaRefSpam4Df")) {
  gaRefSpam4Df <- gaRefSpam3Df %>%
    inner_join(gaRefSpam2Df,by=c("domain" = "referrerDomain")) %>%
    filter((is.na(dr_web) | dr_web == "not recommended site" | dr_web == "known infection source"))
  save(gaRefSpam4Df,file="./gaRefSpam4Df")
} else {
  load("./gaRefSpam4Df")
}
gaRefSpam4Df
## domain bitdefender dr_web alexa google websense trendmicro attacks dmozCat shallaCat
## 1 100dollars-seo.com <NA> <NA> <NA> uncategorized uncategorized <NA> 5 <NA> <NA>
## 2 alot.com business <NA> toolbars business society and lifestyles <NA> 1 <NA> <NA>
## 3 asana.com business <NA> <NA> business hosted business applications <NA> 1 <NA> <NA>
## 4 atlassian.net marketing <NA> <NA> marketing educational materials <NA> 2 <NA> <NA>
## 5 basecamphq.com business <NA> <NA> business web collaboration <NA> 1 <NA> <NA>
## 6 best-seo-offer.com <NA> <NA> <NA> elevated exposure elevated exposure <NA> 10 <NA> <NA>
## 7 best-seo-solution.com parked <NA> <NA> parked uncategorized <NA> 7 <NA> <NA>
## 8 binarystream.com <NA> not recommended site <NA> information technology information technology <NA> 1 <NA> <NA>
## 9 buttons-for-website.com <NA> <NA> <NA> suspicious embedded link suspicious embedded link <NA> 1 <NA> <NA>
## 10 buttons-for-your-website.com <NA> <NA> <NA> uncategorized uncategorized <NA> 6 <NA> <NA>
## 11 centurylink.net portals <NA> <NA> portals news and media search engines portals 6 <NA> <NA>
## 12 cincinnatibell.net business <NA> public_utilities business search engines and portals <NA> 1 <NA> <NA>
## 13 clearch.org business <NA> <NA> business business and economy <NA> 1 <NA> <NA>
## 14 cognizant.com business <NA> c business information technology <NA> 1 <NA> <NA>
## 15 darodar.com parked <NA> <NA> parked suspicious content <NA> 2 <NA> <NA>
## 16 delta-search.com searchengines not recommended site <NA> searchengines business and economy <NA> 11 <NA> <NA>
## 17 desk.com business <NA> saas business hosted business applications blogs web communications 1 <NA> <NA>
## 18 disconnect.me education <NA> <NA> education proxy avoidance unknown 2 <NA> <NA>
## 19 dnsrsearch.com <NA> <NA> <NA> search engines and portals search engines and portals <NA> 1 <NA> <NA>
## 20 ecosia.org searchengines <NA> <NA> searchengines search engines and portals search engines portals 5 <NA> <NA>
## 21 evernote.com computersandsoftware <NA> software computersandsoftware personal network storage and backup computers internet,personal network storage 1 <NA> <NA>
## 22 financemarketing.com computersandsoftware <NA> <NA> computersandsoftware financial data and services <NA> 1 <NA> <NA>
## 23 findeer.com searchengines <NA> <NA> searchengines search engines and portals <NA> 1 <NA> <NA>
## 24 hud.gov business <NA> home business government government legal 2 <NA> <NA>
## 25 informationvine.com <NA> <NA> <NA> search engines and portals search engines and portals <NA> 1 <NA> <NA>
## 26 isket.jp blogs <NA> <NA> blogs information technology <NA> 7 <NA> <NA>
## 27 ixenia.com <NA> <NA> <NA> uncategorized uncategorized <NA> 1 <NA> <NA>
## 28 justprofit.xyz business <NA> <NA> business elevated exposure <NA> 2 <NA> <NA>
## 29 k9safesearch.com education <NA> <NA> education business and economy <NA> 3 <NA> <NA>
## 30 larger.io <NA> <NA> <NA> business and economy business and economy <NA> 3 <NA> <NA>
## 31 live.com webmail <NA> internet webmail search engines and portals search engines portals,email 1 <NA> <NA>
## 32 locatimefree.com business <NA> <NA> business business and economy <NA> 30 <NA> <NA>
## 33 microsofttranslator.com education <NA> traductors_automàtics education reference materials translators cached pages 1 <NA> <NA>
## 34 moz.com computersandsoftware <NA> <NA> computersandsoftware information technology computers internet 1720 <NA> <NA>
## 35 NA.NA <NA> <NA> <NA> <NA> <NA> <NA> 1 <NA> <NA>
## 36 obrazky.cz marketing <NA> služby marketing search engines and portals <NA> 1 <NA> <NA>
## 37 office365.com computersandsoftware <NA> <NA> computersandsoftware collaboration - office computers internet 1 <NA> <NA>
## 38 pushbullet.com marketing <NA> <NA> marketing information technology disease vector,spam 1 <NA> <NA>
## 39 qwant.com computersandsoftware <NA> moteurs_de_recherche computersandsoftware search engines and portals <NA> 1 <NA> <NA>
## 40 rankings-analytics.com <NA> <NA> <NA> suspicious content suspicious content <NA> 3 <NA> <NA>
## 41 rankscanner.com blogs <NA> <NA> blogs information technology <NA> 24 <NA> <NA>
## 42 richpasco.org <NA> <NA> <NA> uncategorized uncategorized <NA> 1 <NA> <NA>
## 43 salesforce.com computersandsoftware <NA> contact_management computersandsoftware hosted business applications business economy 2 <NA> <NA>
## 44 saltpalace.com business <NA> <NA> business business and economy <NA> 25 <NA> <NA>
## 45 searchlock.com business <NA> <NA> business proxy avoidance <NA> 6 <NA> <NA>
## 46 securesearch.co business <NA> <NA> business entertainment <NA> 1 <NA> <NA>
## 47 semaltmedia.com business <NA> <NA> business uncategorized <NA> 4 <NA> <NA>
## 48 servicepunt71.nl <NA> <NA> <NA> business and economy business and economy <NA> 1 <NA> <NA>
## 49 seznam.cz searchengines <NA> portály searchengines search engines and portals search engines portals 1 <NA> <NA>
## 50 shawcable.net business <NA> <NA> business business and economy <NA> 1 <NA> <NA>
## 51 social-buttons.com <NA> not recommended site <NA> uncategorized uncategorized <NA> 18 <NA> <NA>
## 52 sosodesktop.com <NA> <NA> <NA> information technology information technology <NA> 1 <NA> <NA>
## 53 startjuno.com business <NA> <NA> business news and media <NA> 1 <NA> <NA>
## 54 startnetzero.net business <NA> <NA> business news and media <NA> 1 <NA> <NA>
## 55 startssl.com computersandsoftware <NA> <NA> computersandsoftware business and economy internet infrastructure 1 <NA> <NA>
## 56 success-seo.com business <NA> <NA> business uncategorized <NA> 48 <NA> <NA>
## 57 suddenlink.net portals <NA> <NA> portals search engines and portals <NA> 3 <NA> <NA>
## 58 tds.net education <NA> <NA> education information technology news media 1 <NA> <NA>
## 59 telstra.com.au business <NA> carriers business business and economy business economy 3 <NA> <NA>
## 60 thegeekspeaks.net <NA> <NA> <NA> uncategorized uncategorized <NA> 2 <NA> <NA>
## 61 toshiba.com business <NA> <NA> business business and economy <NA> 1 <NA> <NA>
## 62 twcc.com <NA> <NA> <NA> entertainment entertainment <NA> 1 <NA> <NA>
## 63 video--production.com <NA> <NA> <NA> uncategorized uncategorized <NA> 3 <NA> <NA>
## 64 videos-for-your-business.com <NA> <NA> <NA> uncategorized uncategorized <NA> 6 <NA> <NA>
## 65 windstream.net business <NA> <NA> business search engines and portals search engines portals 2 <NA> <NA>
## 66 xfinity.com business <NA> <NA> business business and economy <NA> 29 <NA> <NA>
## 67 ygask.com <NA> <NA> <NA> uncategorized uncategorized <NA> 1 <NA> <NA>
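The explicit `is.na(dr_web)` term in the filter matters because comparing NA with `==` yields NA rather than FALSE, and dplyr's `filter` drops rows where the condition is NA. A small base-R sketch of the same condition, using three Dr. Web values taken from the Virustotal output above (the variable names are illustrative only):

```r
vt <- data.frame(
  domain = c("binarystream.com", "darodar.com", "diigo.com"),
  dr_web = c("not recommended site", NA, "social networks"),
  stringsAsFactors = FALSE
)

# Comparing NA with == gives NA, so darodar.com is neither kept nor rejected
flag <- vt$dr_web == "not recommended site"

# Spelling out is.na() keeps unclassified domains alongside the flagged ones;
# %in% never returns NA, so the combined condition is always TRUE or FALSE
spam_labels <- c("not recommended site", "known infection source")
keep <- is.na(vt$dr_web) | vt$dr_web %in% spam_labels
candidates <- vt[keep, ]
```

Here `candidates` holds binarystream.com and darodar.com, while diigo.com is dropped because Dr. Web gives it an ordinary category.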
Just looking at the Dr. Web classification still gets two false positives: one for startssl.com and one for moz.com. It is surprising that these two are not classified by Dr. Web, but since they are classified by Trend Micro or Alexa, we can add an additional filter:
#
# Filter on characteristics of known referral spam domains
#
if (!file.exists("./gaRefSpam5Df")) {
  gaRefSpam5Df <- gaRefSpam4Df %>%
    filter(is.na(trendmicro) & is.na(alexa))
  save(gaRefSpam5Df,file="./gaRefSpam5Df")
} else {
  load("./gaRefSpam5Df")
}
gaRefSpam5Df
## domain bitdefender dr_web alexa google websense trendmicro attacks dmozCat shallaCat
## 1 100dollars-seo.com <NA> <NA> <NA> uncategorized uncategorized <NA> 5 <NA> <NA>
## 2 asana.com business <NA> <NA> business hosted business applications <NA> 1 <NA> <NA>
## 3 atlassian.net marketing <NA> <NA> marketing educational materials <NA> 2 <NA> <NA>
## 4 basecamphq.com business <NA> <NA> business web collaboration <NA> 1 <NA> <NA>
## 5 best-seo-offer.com <NA> <NA> <NA> elevated exposure elevated exposure <NA> 10 <NA> <NA>
## 6 best-seo-solution.com parked <NA> <NA> parked uncategorized <NA> 7 <NA> <NA>
## 7 binarystream.com <NA> not recommended site <NA> information technology information technology <NA> 1 <NA> <NA>
## 8 buttons-for-website.com <NA> <NA> <NA> suspicious embedded link suspicious embedded link <NA> 1 <NA> <NA>
## 9 buttons-for-your-website.com <NA> <NA> <NA> uncategorized uncategorized <NA> 6 <NA> <NA>
## 10 clearch.org business <NA> <NA> business business and economy <NA> 1 <NA> <NA>
## 11 darodar.com parked <NA> <NA> parked suspicious content <NA> 2 <NA> <NA>
## 12 delta-search.com searchengines not recommended site <NA> searchengines business and economy <NA> 11 <NA> <NA>
## 13 dnsrsearch.com <NA> <NA> <NA> search engines and portals search engines and portals <NA> 1 <NA> <NA>
## 14 financemarketing.com computersandsoftware <NA> <NA> computersandsoftware financial data and services <NA> 1 <NA> <NA>
## 15 findeer.com searchengines <NA> <NA> searchengines search engines and portals <NA> 1 <NA> <NA>
## 16 informationvine.com <NA> <NA> <NA> search engines and portals search engines and portals <NA> 1 <NA> <NA>
## 17 isket.jp blogs <NA> <NA> blogs information technology <NA> 7 <NA> <NA>
## 18 ixenia.com <NA> <NA> <NA> uncategorized uncategorized <NA> 1 <NA> <NA>
## 19 justprofit.xyz business <NA> <NA> business elevated exposure <NA> 2 <NA> <NA>
## 20 k9safesearch.com education <NA> <NA> education business and economy <NA> 3 <NA> <NA>
## 21 larger.io <NA> <NA> <NA> business and economy business and economy <NA> 3 <NA> <NA>
## 22 locatimefree.com business <NA> <NA> business business and economy <NA> 30 <NA> <NA>
## 23 NA.NA <NA> <NA> <NA> <NA> <NA> <NA> 1 <NA> <NA>
## 24 rankings-analytics.com <NA> <NA> <NA> suspicious content suspicious content <NA> 3 <NA> <NA>
## 25 rankscanner.com blogs <NA> <NA> blogs information technology <NA> 24 <NA> <NA>
## 26 richpasco.org <NA> <NA> <NA> uncategorized uncategorized <NA> 1 <NA> <NA>
## 27 saltpalace.com business <NA> <NA> business business and economy <NA> 25 <NA> <NA>
## 28 searchlock.com business <NA> <NA> business proxy avoidance <NA> 6 <NA> <NA>
## 29 securesearch.co business <NA> <NA> business entertainment <NA> 1 <NA> <NA>
## 30 semaltmedia.com business <NA> <NA> business uncategorized <NA> 4 <NA> <NA>
## 31 servicepunt71.nl <NA> <NA> <NA> business and economy business and economy <NA> 1 <NA> <NA>
## 32 shawcable.net business <NA> <NA> business business and economy <NA> 1 <NA> <NA>
## 33 social-buttons.com <NA> not recommended site <NA> uncategorized uncategorized <NA> 18 <NA> <NA>
## 34 sosodesktop.com <NA> <NA> <NA> information technology information technology <NA> 1 <NA> <NA>
## 35 startjuno.com business <NA> <NA> business news and media <NA> 1 <NA> <NA>
## 36 startnetzero.net business <NA> <NA> business news and media <NA> 1 <NA> <NA>
## 37 success-seo.com business <NA> <NA> business uncategorized <NA> 48 <NA> <NA>
## 38 suddenlink.net portals <NA> <NA> portals search engines and portals <NA> 3 <NA> <NA>
## 39 thegeekspeaks.net <NA> <NA> <NA> uncategorized uncategorized <NA> 2 <NA> <NA>
## 40 toshiba.com business <NA> <NA> business business and economy <NA> 1 <NA> <NA>
## 41 twcc.com <NA> <NA> <NA> entertainment entertainment <NA> 1 <NA> <NA>
## 42 video--production.com <NA> <NA> <NA> uncategorized uncategorized <NA> 3 <NA> <NA>
## 43 videos-for-your-business.com <NA> <NA> <NA> uncategorized uncategorized <NA> 6 <NA> <NA>
## 44 xfinity.com business <NA> <NA> business business and economy <NA> 29 <NA> <NA>
## 45 ygask.com <NA> <NA> <NA> uncategorized uncategorized <NA> 1 <NA> <NA>
After investigating the domains in the list, I found only one clear false positive from this filter: saltpalace.com, a convention center in Salt Lake City where I attended a conference and from which I visited my web site to make sure that it was up and running. There is really no good way to filter out this false positive using domain classification information at this point.
#
# Combine with detail to see if there are other characteristics to use for filtering
#
if (!file.exists("./gaRefSpam6Df")) {
  gaRefSpam6Df <- gaRefSpam5Df %>%
    inner_join(gaRefSpamDetailDf, by = c("domain" = "referrerDomain"))
  save(gaRefSpam6Df, file = "./gaRefSpam6Df")
} else {
  load("./gaRefSpam6Df")
}
summary(gaRefSpam6Df)
## domain  bitdefender  dr_web  alexa  google  websense  trendmicro  attacks.x  dmozCat  shallaCat  pagePath  fullReferrer  attacks.y
## Length:76  Length:76  Length:76  Length:76  business :32  business and economy :37  Length:76  Min. : 1.00  Length:76  Length:76  Length:76  Length:76  Min. : 1.000
## Class :character  Class :character  Class :character  Class :character  searchengines:12  uncategorized :13  Class :character  1st Qu.: 1.00  Class :character  Class :character  Class :character  Class :character  1st Qu.: 1.000
## Mode :character  Mode :character  Mode :character  Mode :character  uncategorized:10  information technology : 4  Mode :character  Median : 6.00  Mode :character  Mode :character  Mode :character  Mode :character  Median : 1.000
## education : 3  search engines and portals: 4  Mean :10.62  Mean : 3.684
## marketing : 2  proxy avoidance : 3  3rd Qu.:24.25  3rd Qu.: 3.000
## (Other) :16  (Other) :14  Max. :48.00  Max. :48.000
## NA's : 1  NA's : 1
gaRefSpam6Df[,c("domain","pagePath","fullReferrer")]
##     domain  pagePath  fullReferrer
##  1  100dollars-seo.com  /  100dollars-seo.com/try.php
##  2  asana.com  /Web-Commerce/100dollars-seo-com-referral-spam  app.asana.com/0/11812954602745/44396681995798
##  3  atlassian.net  /Web-Commerce/social-buttons-com-referrer-spam  lmovim.atlassian.net/browse/VDG-1
##  4  atlassian.net  /Web-Commerce/social-buttons-com-referrer-spam  playhousedigital.atlassian.net/browse/TGCF-76
##  5  basecamphq.com  /Web-Commerce/social-buttons-com-referrer-spam  viminteractive.basecamphq.com/projects/10027358-ee-maintenance/posts/92384734/comments
##  6  best-seo-offer.com  /  best-seo-offer.com/try.php
##  7  best-seo-solution.com  /  best-seo-solution.com/try.php
##  8  binarystream.com  /Loan-Pricing/effective-yield-loan-fee-amortization  crm2015.binarystream.com/_controls/emailbody/msgBody.aspx
##  9  buttons-for-website.com  /  buttons-for-website.com/
## 10  buttons-for-your-website.com  /  buttons-for-your-website.com/
## 11  clearch.org  /Personal-and-Small-Business-Technology/using-multiple-virtual-desktops-on-windows-os-x-and-ubuntu  search.clearch.org/
## 12  darodar.com  /  forum.topic44008047.darodar.com/
## 13  delta-search.com  /  www2.delta-search.com/
## 14  delta-search.com  /about  www2.delta-search.com/
## 15  delta-search.com  /All-Articles  www2.delta-search.com/
## 16  delta-search.com  /contact  www2.delta-search.com/
## 17  delta-search.com  /people  www2.delta-search.com/
## 18  delta-search.com  /Table/Deposit-Pricing/  www2.delta-search.com/
## 19  delta-search.com  /Table/Loan-Pricing/  www2.delta-search.com/
## 20  delta-search.com  /Table/Loan-Pricing/Charge-offs/  www2.delta-search.com/
## 21  delta-search.com  /Table/Open-Source-Software/  www2.delta-search.com/
## 22  delta-search.com  /Table/Operations-and-Information-Technology/Security/  www2.delta-search.com/
## 23  delta-search.com  /Web-Commerce/social-buttons-com-referrer-spam  www2.delta-search.com/
## 24  dnsrsearch.com  /Web-Commerce/100dollars-seo-com-referral-spam  dnsrsearch.com/index_results.php
## 25  financemarketing.com  /Web-Commerce/social-buttons-com-referrer-spam  projects.financemarketing.com/tasks/3362401
## 26  findeer.com  /Web-Commerce/traffic2cash-xyz-google-analytics-referral-spam  search.findeer.com/it/results
## 27  informationvine.com  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  informationvine.com/index
## 28  isket.jp  /Web-Commerce/social-buttons-com-referrer-spam  isket.jp/seo/social-buttons-comからのリファラスパムが大量に・・・/
## 29  ixenia.com  /Web-Commerce/social-buttons-com-referrer-spam  redmine.ixenia.com/issues/1143
## 30  justprofit.xyz  /  justprofit.xyz/
## 31  k9safesearch.com  /Open-Source-Software/r-open-source-statistical-software  k9safesearch.com/search.jsp
## 32  k9safesearch.com  /Personal-and-Small-Business-Technology/using-multiple-virtual-desktops-on-windows-os-x-and-ubuntu  k9safesearch.com/search.jsp
## 33  k9safesearch.com  /Web-Commerce/social-buttons-com-referrer-spam  k9safesearch.com/search.jsp
## 34  larger.io  /  larger.io/
## 35  locatimefree.com  /Web-Commerce/social-buttons-com-referrer-spam  locatimefree.com/google-analytics-realtime-rapid-increase-access-referrer-spam/
## 36  NA.NA  /Web-Commerce/traffic2cash-xyz-google-analytics-referral-spam  131.253.14.125/bvsandbox.aspx
## 37  rankings-analytics.com  /  rankings-analytics.com/try.php
## 38  rankscanner.com  /  rankscanner.com/Domain/mooresoftwareservices.com
## 39  richpasco.org  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  richpasco.org/virus/callerid_spoofing.html
## 40  saltpalace.com  /  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.3/success
## 41  saltpalace.com  /  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.4/success
## 42  saltpalace.com  /All-Articles  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.3/success
## 43  saltpalace.com  /Deposit-Pricing/using-time-variable-fees-to-solve-peak-workload-problems  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.3/success
## 44  saltpalace.com  /Table/Deposit-Pricing/  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.3/success
## 45  saltpalace.com  /Table/Deposit-Pricing/  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.4/success
## 46  saltpalace.com  /Table/Loan-Pricing/  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.3/success
## 47  saltpalace.com  /Table/Loan-Pricing/  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.4/success
## 48  saltpalace.com  /Table/Operations-and-Information-Technology/  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.3/success
## 49  saltpalace.com  /Table/Operations-and-Information-Technology/Web-Commerce/  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.3/success
## 50  saltpalace.com  /Table/Personal-and-Small-Business-Technology/  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.3/success
## 51  saltpalace.com  /Web-Commerce/social-buttons-com-referrer-spam  spcc-meru-guestsvc.saltpalace.com/portal/RootsTech2015/10.52.0.3/success
## 52  searchlock.com  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  searchlock.com/
## 53  searchlock.com  /Personal-and-Small-Business-Technology/windows-10-upgrade-experience-for-lenovo-w541  searchlock.com/
## 54  searchlock.com  /Web-Commerce/social-buttons-com-referrer-spam  searchlock.com/
## 55  securesearch.co  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  securesearch.co/search/
## 56  semaltmedia.com  /  semaltmedia.com/
## 57  servicepunt71.nl  /Web-Commerce/social-buttons-com-referrer-spam  webmail.servicepunt71.nl/owa/redir.aspx
## 58  shawcable.net  /Personal-and-Small-Business-Technology/sales-and-lead-management-with-suitecrm  wm-s.glb.shawcable.net/zimbra/mail
## 59  social-buttons.com  /  site34.social-buttons.com/
## 60  sosodesktop.com  /Web-Commerce/traffic2cash-xyz-google-analytics-referral-spam  search.sosodesktop.com/search/web
## 61  startjuno.com  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  startjuno.com/search/index.php
## 62  startnetzero.net  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  startnetzero.net/search/index.php
## 63  success-seo.com  /  success-seo.com/try.php
## 64  suddenlink.net  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  home.suddenlink.net/search/index.php
## 65  thegeekspeaks.net  /Web-Commerce/social-buttons-com-referrer-spam  thegeekspeaks.net/social-buttons-com-spams-google-analytics/
## 66  toshiba.com  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  home.toshiba.com/search/index.php
## 67  twcc.com  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  search.twcc.com/
## 68  video--production.com  /  video--production.com/
## 69  videos-for-your-business.com  /  53275950.videos-for-your-business.com/
## 70  videos-for-your-business.com  /  videos-for-your-business.com/
## 71  xfinity.com  /Loan-Pricing/effective-yield-loan-fee-amortization  search.xfinity.com/
## 72  xfinity.com  /Personal-and-Small-Business-Technology/downloading-and-preparing-ftc-robocall-complaint-list-for-ncid  search.xfinity.com/
## 73  xfinity.com  /Personal-and-Small-Business-Technology/stopping-rachel-from-cardholder-services  search.xfinity.com/
## 74  xfinity.com  /Personal-and-Small-Business-Technology/using-ncid-on-two-phone-lines  search.xfinity.com/
## 75  xfinity.com  /Web-Commerce/why-and-how-to-set-up-ssl-https-on-your-web-site  search.xfinity.com/
## 76  ygask.com  /Loan-Pricing/effective-yield-loan-fee-amortization  ygask.com/effective-interest-method-of-amortization-k.html
It looks like many of the referral spam domains reference the root page, since this is the only page that is sure to exist on every site. Next we eliminate the rows where a domain referred to a page path other than /.
#
# Keep only referrals to the root page, then count distinct page paths per domain
#
if (!file.exists("./gaRefSpam7Df")) {
  gaRefSpam7Df <- gaRefSpam6Df %>%
    filter(pagePath == "/") %>%
    #group_by(domain,bitdefender,dr_web,alexa,google,websense,trendmicro,dmozCat,shallaCat,attacks.x) %>%
    group_by(domain, dr_web, alexa, trendmicro) %>%
    summarize(numRefPages = n_distinct(pagePath))
  #gaRefSpam7Df <- gaRefSpam7Df %>% filter(numRefPages <= 1) %>%
  #  group_by(domain,bitdefender,dr_web,alexa,google,websense,trendmicro,dmozCat,shallaCat,attacks.x)
  save(gaRefSpam7Df, file = "./gaRefSpam7Df")
} else {
  load("./gaRefSpam7Df")
}
gaRefSpam7Df
## Source: local data frame [17 x 5]
## Groups: domain, dr_web, alexa [?]
##
##     domain                        dr_web                alexa  trendmicro  numRefPages
##     <chr>                         <chr>                 <chr>  <chr>       <int>
##  1  100dollars-seo.com            <NA>                  <NA>   <NA>        1
##  2  best-seo-offer.com            <NA>                  <NA>   <NA>        1
##  3  best-seo-solution.com         <NA>                  <NA>   <NA>        1
##  4  buttons-for-website.com       <NA>                  <NA>   <NA>        1
##  5  buttons-for-your-website.com  <NA>                  <NA>   <NA>        1
##  6  darodar.com                   <NA>                  <NA>   <NA>        1
##  7  delta-search.com              not recommended site  <NA>   <NA>        1
##  8  justprofit.xyz                <NA>                  <NA>   <NA>        1
##  9  larger.io                     <NA>                  <NA>   <NA>        1
## 10  rankings-analytics.com        <NA>                  <NA>   <NA>        1
## 11  rankscanner.com               <NA>                  <NA>   <NA>        1
## 12  saltpalace.com                <NA>                  <NA>   <NA>        1
## 13  semaltmedia.com               <NA>                  <NA>   <NA>        1
## 14  social-buttons.com            not recommended site  <NA>   <NA>        1
## 15  success-seo.com               <NA>                  <NA>   <NA>        1
## 16  video--production.com         <NA>                  <NA>   <NA>        1
## 17  videos-for-your-business.com  <NA>                  <NA>   <NA>        1
Conclusions
Before doing any work in R using Google Analytics data, you must remove all of the referral spam web sites from your data; this can be done easily using the rdomains package. Because the classification data changes, it will be necessary to revisit this script on a regular basis.
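As a minimal sketch of that last removal step, the spam list can be dropped from a referral data frame with an anti-join, while a hand-maintained list keeps known false positives such as saltpalace.com. The data frames gaReferralsDf and spamDomainsDf, the vector allowedDomains, the sessions column and the example.org row are all hypothetical stand-ins for illustration; in practice the spam list would come from something like the domain column of gaRefSpam7Df.

```r
library(dplyr)

# Hypothetical referral data; real data would come from the Google Analytics query.
gaReferralsDf <- data.frame(
  referrerDomain = c("success-seo.com", "saltpalace.com", "example.org"),
  sessions = c(48, 25, 12),
  stringsAsFactors = FALSE
)

# Hypothetical spam list, standing in for the domains identified above.
spamDomainsDf <- data.frame(
  domain = c("success-seo.com", "saltpalace.com"),
  stringsAsFactors = FALSE
)

# Known false positives that should stay in the analysis (assumption: maintained by hand).
allowedDomains <- c("saltpalace.com")

# Drop every referral whose domain is on the spam list, except the allowed ones.
cleanReferralsDf <- gaReferralsDf %>%
  anti_join(
    spamDomainsDf %>% filter(!domain %in% allowedDomains),
    by = c("referrerDomain" = "domain")
  )

cleanReferralsDf
```

Keeping the allowed domains in a separate vector means the classification-based script can be rerun as the classification data changes without losing the manual corrections.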
- Details
- Written by Bruce Moore
- Hits: 4079

Converting to PHP 7
PHP 7 was released on December 3, 2015, but was not fully available to WHM/cPanel users on my hosting service until WHM 60, which formally supported EasyApache 4 (at least at my hosting firm). A couple of weeks ago, my hosting firm moved everyone to WHM 60, and after a lot of backups, I migrated from EasyApache 3 to EasyApache 4, and then switched my sites from PHP 5.6 to PHP 7.0. PHP 7 is supposed to reduce memory use and is claimed to be twice as fast as PHP 5.6.
EasyApache 4 was available in WHM 58, but my hosting firm did not support it at that time.
I have not benchmarked it, but all of my sites are noticeably more responsive. I’ve got one problem with rendering social media icons in Opera, but I’m not sure whether this is related to PHP 7 or was already there.
Joomla 3.5 is the first release of Joomla to support PHP 7. Releases of WordPress after November of last year should support PHP 7.
Testing
You should check to make sure that your PHP configuration supports your CMS and other software, especially if you need to support multiple languages. The default settings on my WHM installation did not include iconv and mbstring, both of which are necessary for Piwik.
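If you have shell access and R on the server, a quick check for these two extensions is to list the modules that the PHP command-line binary reports and look for the ones you need. This is only a sketch: missing_php_extensions is a hypothetical helper name, and the command-line PHP may be configured differently from the PHP that serves your web site, so treat the result as a hint rather than a definitive answer.

```r
# Return the required extensions that do not appear in the output of `php -m`.
# module_lines is a character vector with one module name per line.
missing_php_extensions <- function(module_lines,
                                   required = c("iconv", "mbstring")) {
  loaded <- tolower(trimws(module_lines))
  setdiff(tolower(required), loaded)
}

# Only run the command if a php binary is on the PATH.
php <- Sys.which("php")
if (nzchar(php)) {
  modules <- system2(php, "-m", stdout = TRUE)
  print(missing_php_extensions(modules))
}
```

An empty result means both extensions were reported as loaded by the command-line binary.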
Joomla
In Joomla, you can look at the PHP and other system settings via System->System Information; the PHP Settings tab contains all of the information on relevant PHP settings but does not flag settings that are problematic.


Piwik
To check the PHP settings for your Piwik installation, go to the admin panel and select the System Check option under the Diagnostic heading. It will check all settings and tell you of any problems as shown in Figure 3.

- Written by Bruce Moore

The news accounts of John Podesta's email hack indicate that his password was stolen by a site that impersonated a Google login screen. The link was sent to him in a spear-phishing email.
The account didn't say whether or not he had two-factor authentication turned on, but he probably did not. If he had turned it on, the hackers would not have gotten the text message with the login code when they used the password the first time. He would have, and would have realized that his account had been attacked and could have taken action before anything was compromised.
Both Google and Facebook have had two-factor, cell-phone-based authentication for a few years, and many other services are starting to use it.
Joomla has had two-factor authentication since 3.2; WordPress does not appear to support it as a core function, but does have plugins to support it.
If you haven't enabled two-factor authentication on your Google, Facebook and other accounts that offer it, just do it.
- Written by Bruce Moore

Choosing Your Domain Name Server (DNS)
The election coverage recently included an article about domain name service (DNS) requests by one of Donald Trump’s servers for a server at a Russian bank. While reading the article, I realized that many people probably just use the DNS servers provided by their Internet service provider (ISP) or Wi-Fi connection, which is not necessarily a good idea from a security perspective. Each time you load a web page, open your email client, use the “update” function in a software product or do anything else on the web, your computer makes a DNS request to translate the name of a server, in the form www.domain.com, to the Internet Protocol (IP) address of the server, in the form 1.1.1.1.
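To see this translation happen, the lookup can be made by hand. The sketch below assumes the curl package for R is installed and uses its nslookup function; resolve_host is a hypothetical wrapper name, and localhost is used only so that the example runs without a network connection.

```r
library(curl)

# Resolve a host name using the DNS servers configured on this machine.
# Returns a character vector of IP addresses, or NULL if the lookup fails.
resolve_host <- function(host) {
  nslookup(host, multiple = TRUE, error = FALSE)
}

# Any host name can be substituted here.
resolve_host("localhost")
```

Whatever resolver your router or ISP handed to the machine is the one that answers, which is why the choice of DNS server matters.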
If you use the DNS servers that your ISP gives to your router, your ISP can (and will) keep track of those requests and sell information on the sites that you visit. Some ISPs have a history of having their DNS servers compromised in a DNS poisoning attack; instead of giving you the IP address of mybank.com, the compromised DNS server gives you the address of impersonatedmybank.com and you can’t tell the difference. If you connect to a public Wi-Fi router, it may hand out a compromised DNS server so that your traffic gets routed to malicious addresses. The Wi-Fi routers in most venues are not well secured, so the likelihood of problems with compromised DNS is high.
I used the DNS services provided by my ISP until I detected a compromised DNS server, at which point I manually switched all of my machines to DNS servers run by a security software provider. If you are having DNS problems, switching to Google’s servers is a sure-fire way to fix them, but recognize that Google is logging your DNS activity. Here are some common DNS servers that you might set permanently on all of your devices. Most DNS providers offer a primary and an alternate (backup) server address. The security software websites listed have instructions for changing addresses that I won’t repeat here.
You should change the DNS settings in your router so that devices that connect to your router via DHCP will get the more secure DNS servers. You should also change the DNS settings on your laptop and other devices as well.
Norton
Norton offers three pairs of DNS servers described on the Norton ConnectSafe site:
- Filtering for malware only using 199.85.126.10 and 199.85.127.10
- Filtering for malware and pornography using 199.85.126.20 and 199.85.127.20
- Filtering for malware, pornography and stuff that you probably do not want your children surfing using 199.85.126.30 and 199.85.127.30
Comodo
Comodo, a lesser-known security software vendor in the retail world, offers DNS servers described in Comodo Secure DNS. The servers are at 8.26.56.26 and 8.20.247.20.
OpenDNS
OpenDNS (a part of networking hardware giant Cisco) offers several free home DNS services.
Google
Google uses 8.8.8.8 and 8.8.4.4. These do not provide any malware filtering, but are good for diagnosing DNS problems. Your traffic is certainly tracked, but these two are always fast. See Google Public DNS.
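As a rough way to spot a resolver that returns unexpected answers, the same host name can be looked up through several of these servers and the results compared by eye. The sketch below assumes the nslookup command-line tool is installed; dnsServers, nslookup_args and compare_dns are hypothetical names, and only the primary Comodo and Google addresses are used.

```r
# Primary resolver addresses for two of the providers described in this article.
dnsServers <- c(Comodo = "8.26.56.26", Google = "8.8.8.8")

# Arguments for the nslookup command-line tool: host name first, then the server to ask.
nslookup_args <- function(host, server) {
  c(host, server)
}

# Ask each server for the same host and return the raw tool output per provider.
compare_dns <- function(host, servers = dnsServers) {
  tool <- Sys.which("nslookup")
  if (!nzchar(tool)) stop("nslookup command not found")
  lapply(servers, function(server) {
    suppressWarnings(
      system2(tool, nslookup_args(host, server), stdout = TRUE, stderr = TRUE)
    )
  })
}

# Example (needs network access):
# compare_dns("example.com")
```

Different answers are not proof of a problem, since large sites often return different addresses depending on which resolver asks, but an answer that points somewhere unrelated to the site is worth investigating.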
Summary
Changing your DNS services from the default values can provide much safer surfing. I do not get very many situations where the security software DNS stops me, but I am so, so glad when it does.
- Written by Bruce Moore
Problems Editing Modules After Upgrade to Joomla 3.6.3
After upgrading to Joomla 3.6.3, I ran into a problem where I could not edit a module–any module. Nor could I create a new module. I tried changing editors, but that made no difference. By happenstance, on the console I noticed that a couple of articles were locked and went to unlock them. Suddenly, everything worked normally.
Firefox Problems
I got things working on one site with the fix above, but two sites still had the same error. Using Chrome instead of Firefox bypassed this problem. I then went into Firefox and disabled all extensions: this fixed the problem as well. I’m in the process of figuring out which extensions cause the problem.
- Written by Bruce Moore