Mobile Malware Oct 15th 2015
From Federal Burro of Information
malware in mobile
gtp-c: imsi (user), imei (device) and msisdn (phone number)
if three big guys get the same number, then it's probably right: BULLSHIT! are they using the same process?

"created_at":"Mon Oct 12 19:42:21 +0000 2015"
https://elasticsearch-py.readthedocs.org/en/master/api.html
/usr/lib/python2.7/site-packages/elasticsearch/client/indices.py
478 :arg timeout: Explicit timestamp for the document
_______
ML making and breaking
@cchio works at shape security
anomaly detection
ML has seen lots of dev.

why is ML not used a lot?
false positive and false negative tolerance
semantic gap: you get an alert and you don't know WHY it was flagged. for example you get an IP address, why is it bad? annotation?
evaluation problem: it's harder to make an eval system than it is to make the system. the classical sample test is crusty and old.
adversarial impact: advanced actors will spend time to bypass. snowshoeing? it is still possible to circumvent. change model with time.

how have AD (anomaly detection) systems failed in the past:
model poisoning
hard to find attack-free training data.
libs and tools (sklearn): people often use default parameters, esp with hard deadlines

is it hopeless? find out by actually doing it.

four steps:
1. gen time series
2. select rep features
3. train for normality
4. alert if incoming points deviate.

example infrastructure - get a copy of the one from 5 years ago.
PCA anomaly detector builds a model.
manual validation is required, to cover false pos and false neg.

common techniques:
clustering
svm
neural nets
subspace
correlation based

how to build a model. using gets what are the X and Y
selecting features, hardest part: use eyeball / human
ECA select features automatically
isn't it just a parameter optimization problem?
Question: what platform do you use for crunching numbers. distributed tools?
if the feature is hard to explain then it's hard to decompose the results and put it in context.
principal component analysis: auto selection of features.
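
sketch of the four steps listed above (gen time series, select feature, train for normality, alert on deviation). not from the talk: this assumes a single made-up feature (requests per minute) and swaps in a plain z-score threshold where the talk used PCA. the function names, data and threshold are hypothetical.

```python
import statistics


def train_normality(history):
    """Step 3: learn 'normal' from history that is assumed to be attack free."""
    return statistics.mean(history), statistics.stdev(history)


def is_anomalous(value, model, threshold=3.0):
    """Step 4: flag a point more than `threshold` standard deviations from the mean."""
    mean, stdev = model
    return abs(value - mean) / stdev > threshold


# Steps 1 and 2: a made-up time series of one feature (requests per minute).
history = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98]
model = train_normality(history)

# New points arriving after training; only the spike should alert.
incoming = [101, 99, 250, 100]
alerts = [v for v in incoming if is_anomalous(v, model)]
print(alerts)
```

the alert only says "250 is far from normal", not why, which is the semantic gap from earlier in these notes.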
virtual features, synthetic features, compound features. purely statistical.
This guy is a pragmatist.
PCA produces an ordered list of dimensions. get the most from your data with the fewest dimensions.
with PCA you can find dimensions that are latent.
SCREE plot: the earlier the knee, the fewer dimensions required to cluster your data.

how to avoid common pitfalls:
understand your threat model.
PCA is not enough on its own.
keep detection scope narrow.
close the semantic gap - how to evaluate your AD, how well can you filter false pos. how to filter true positives

image recognition - DOD: distinguish between tank and car? worked... but for the wrong reason
Ha!: exactly, tracking on a correlated var, not a tank or car. speaks to testing data. tanks on green, cars on roads... so it tracked on road / green, not car or tank.

how do we attack this problem? two ways:
1. attack learning so it learns wrong as right
2. degrade performance of the system to compromise reliability.

Chaff - set to confuse, moves the cluster.
directed chaff moves the center of the cluster.
undirected chaff weakens the "strength" of the cluster.
"boiling frog attack" - chaff volume versus period of time.
DT: connecting tdd to ml training - the behaviors that come from test suites could be used for calibration sets that protect against chaff attacks.

a second way: decision boundary ratio detection.
slowly push points outward as a way to avoid the detection of chaff attacks.
define decision boundary area.
DT: connect decision boundary to software versions maybe.

can ML be secure? it depends. the point is to slow them down, and detect.

how to defend PCA:
antidote
principal component pursuit
robust PCA uses median instead of mean.
laplacian, gaussian (gaussian is a dist), also poisson

my own tests:
uses data set: free apache data sets.
let PCA do all the work, to see how PCA worked.
"projection into target flow" versus "??" shows naive versus robust
injecting chaff shows some movement of the mean.
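
toy sketch of the chaff / boiling frog idea above, to see why a median-based centre is harder to drag than a mean. this is not the speaker's experiment and not PCA: it assumes one-dimensional made-up data, retrains a centre estimate each round, and the function name and numbers are hypothetical.

```python
import statistics


def boiling_frog(baseline, chaff_value, rounds, per_round):
    """Retrain each round on the baseline plus all chaff injected so far.

    Returns a list of (mean, median) centre estimates, one per round.
    """
    data = list(baseline)
    centres = []
    for _ in range(rounds):
        # The attacker adds only a little chaff per training period.
        data.extend([chaff_value] * per_round)
        centres.append((statistics.mean(data), statistics.median(data)))
    return centres


# 70 made-up "normal" points centred on 100.
baseline = [98, 99, 100, 100, 100, 101, 102] * 10
centres = boiling_frog(baseline, chaff_value=200, rounds=5, per_round=2)

for i, (mean, median) in enumerate(centres, start=1):
    print(f"round {i}: mean={mean:.1f} median={median}")
```

in this toy case the mean creeps upward every round while the median stays put. with enough chaff the median moves too, so this only shows the direction of the effect, not that a median-based detector is safe.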
this is a decision boundary issue, not that naive PCA moved more than robust PCA.
simulated training periods
random detector - positive control for decision
even with robust PCA: 38% evasion success using boiling frog attack

Data mining for cyber security meetup.
sparkML
sklearn
_________________
stealthier attacks
zero math, zero crypto talk.
@synackpse #TLSFP
when fingerprinting: drop random-length fields, session IDs
deobfuscation
MLsec
fast flux a tls fingerprint
fdr: fingerprint defined routing - cool!
fingerprint canaries: keep wrong things open and alert on connect.
nation state attack: sigint. karma police.
honey pots are useful.

tools:
fingerprintls: can take pcap, server_name from extension
fingerprintout: another tool, to print out the actual fingerprint in different formats: c, snort, xkeyscore output
fingerprintDB
github leebrotherston/tls-fingerprint
openssl defaults to sslv2
nist curve highly used

Leader: 0178147 0188120
frank / ops owns cost
pricing meeting for what.
reboot monitoring
Kelley tirangalo under jason borne
Partner criteria
partner program - reseller motivation
______________
siem and the art of log management
config
operating or using
security plan
break nets into zones, like users and like asset types
what and where to monitor
NO ONE USES ZONES ANY MORE.
Comprehensive incident response plan. TONS of things to plan for: assets, zones,
successful siem teams invest in the monitoring and alerting. what the eff is a "siem team"
successful siem teams configure and craft alerts themselves.
one big failing point: failure to keep siem in sync with net. manual process? same with software
"gonna wanna reconsider everything there"
monitoring ungroked data.
siem team needs to be on the change board.
ids on siem.
you need the right hardware / software for a siem
given an env it generates storage.
+ growth
monitor your own siem
HA and failover

mss guys: don't know your business
mss guys: subject manager
poaching protection
training v "get rid of human resource problem."
MSS is responsible for data, usually they take care of it.
what are the data retention, destruction, availability SLAs?
deployment time is small - due to SoC
how do you drop sensitive data?
can also get other related services.
what makes a study comprehensive.
save 1/3 of siem costs by going mss
_____
cymon
mozdef - incident
from ml binary tree model
alexa good domain
dga tracking
cloud architecture -.... so awesome.
worker tier:
get config from bucket
connect to sqs queue
dumps results to rds
web tier
data tier
cymon interceptor - chrome plugin