Gcp Notes
Latest revision as of 20:21, 28 October 2024
Overview
Auth
get the auth file and then:
export GOOGLE_APPLICATION_CREDENTIALS="/usr/home/user/.gcp/XXX-XXX.json"
whoami ?
gcloud auth list
gsutil version -l
gsutil will show the legacy boto file:
${HOME}/.config/gcloud/legacy_credentials/david.thornton@domain.com/.boto
but in the same dir there is:
${HOME}/.config/gcloud/legacy_credentials/david.thornton@domain.com/adc.json
which you can put in the GOOGLE_APPLICATION_CREDENTIALS env var.
There are a couple of other env vars, and it's not clear when to use which one. It's a bit all over the place. At this time GOOGLE_APPLICATION_CREDENTIALS works in most of the places I care about (terraform).
echo ${GOOGLE_CREDENTIALS}
echo ${GOOGLE_CLOUD_KEYFILE_JSON}
echo ${GCLOUD_KEYFILE_JSON}
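A quick way to see which of these variables are set in the current shell. This is only a sketch: the function name and the listing order are mine, and the order implies no precedence in any tool.

```python
import os

# Variable names are the ones mentioned in these notes. The order is just
# the order used here, not a precedence documented by any tool.
CREDENTIAL_ENV_VARS = [
    "GOOGLE_APPLICATION_CREDENTIALS",
    "GOOGLE_CREDENTIALS",
    "GOOGLE_CLOUD_KEYFILE_JSON",
    "GCLOUD_KEYFILE_JSON",
]


def credential_env_report(environ):
    """Return {name: value or None} for each credential-related variable."""
    return {name: environ.get(name) or None for name in CREDENTIAL_ENV_VARS}


if __name__ == "__main__":
    # Print only set/unset so that key material never lands in a terminal log.
    for name, value in credential_env_report(os.environ).items():
        print(f"{name}: {'set' if value else 'unset'}")
```

An empty string counts as unset here, which matches how most tools treat it.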
Projects
A logical place to put your stuff.
Use this "bag" as a billing unit.
As much as you may want to use labels for billing, some charges can't be labeled. Projects partition those costs.
list your projects:
gcloud projects list
Not all projects have billing accounts.
- Labeling and filters and formatting at the command line
- https://cloud.google.com/blog/products/gcp/filtering-and-formatting-fun-with
Storage
Types of storage, how to choose:
https://cloud.google.com/storage-options/
Outputs
Extract one label as a column in table output:
gcloud pubsub topics list --format="table[box](name:sort=1,labels[product])"
┌───────────────────────────────────────────┬──────────┐
│ NAME                                      │ PRODUCT  │
├───────────────────────────────────────────┼──────────┤
│ projects/($project)/topics/(topic name )  │ (product)│
│ ...                                       │ ...      │
└───────────────────────────────────────────┴──────────┘
Formatting Output
Here is an example of how fields in the API response can be formatted for display in a table. It comes from a gcloud error message, which echoes the projection expression it was trying to parse:
(gcloud.artifacts.repositories.list) Expected ) in projection expression [ table[title="ARTIFACT_REGISTRY"](
  name.basename():label=REPOSITORY,
  format:label=FORMAT,
  mode.basename(undefined=STANDARD_REPOSITORY):label=MODE,
  description:label=DESCRIPTION,
  name.segment(3):label=LOCATION,
  labels.list():label=LABELS,
  kmsKeyName.yesno(yes='Customer-managed key', no='Google-managed key'):label=ENCRYPTION,
  createTime.date(tz=LOCAL),
  updateTime.date(tz=LOCAL),
  sizeBytes.size(zero='0',precision=3,units_out=M):label="SIZE (MB)"
) table[](name,cleanupPolicyDryRun,sizeBytes *HERE* /1000000000)].
Compute
https://cloud.google.com/sdk/gcloud/reference/compute/instances/create
How do I list the project and family of well-known images for terraform builds?
gcloud compute images list --standard-images
list non-running instances
gcloud compute instances list
_always_ shows _only_ running instances.
But what about the failed, initializing, terminated instances? Try this:
gcloud compute instances list --filter="status:*"
on scaling:
https://cloud.google.com/compute/docs/autoscaler/understanding-autoscaler-decisions
(Logs still don't tell you which metric was used to decide to scale.)
List instances lit up with a template
gcloud compute instances list --filter='metadata.items.list(show="keys"):instance-template'
or list name and template:
gcloud compute instances list --format='value[](name,metadata.items.instance-template)'
list template with metadata item X
gcloud compute instance-templates list \
  --format='value[](name,properties.metadata.items.X)'
Port forwarding via gcloud
Forward local port 4545 to port 4545 on the instance:
gcloud compute ssh --zone us-central1-c <<instancename>> --verbosity=info -- -NL 4545:localhost:4545
Now connect to localhost:4545 to reach the port on the cloud instance.
OS Login
So you want to just ssh into the VM like you do everywhere else; you don't want to use
gcloud compute ssh ...
or the "in browser" ssh client.
Great, you want "OS Login".
Lots of steps:
1. For the VM, set the enable-oslogin metadata value to "TRUE"
In tf, like this:
metadata = {
  enable-oslogin = "TRUE"
}
2. Give the user the correct roles:
Compute OS Login (for vanilla, non-root access)
Compute OS Admin Login (for root access via sudo)
Via the command line, something like this, I think:
gcloud projects add-iam-policy-binding project-ID \
  --member serviceAccount:"USERNAME@project-ID.iam.gserviceaccount.com" \
  --role "roles/iam.serviceAccountUser" \
  --no-user-output-enabled --quiet
loop over all hosts
get a list of names and zones:
gcloud compute instances list --filter="labels.label_name=label_value" --format="table[no-heading](name,zone)" > ~/tmp/instancelist
gcloud compute ssh needs the VM's zone.
while read name zone
do
  echo gcloud compute ssh --zone \"$zone\" \"$name\" --project \"${project}\" --command=\"important command\"
done < ~/tmp/instancelist > ~/tmp/out.sh
Then run: sh ~/tmp/out.sh
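The same generate-then-run idea, sketched in Python so that the quoting is handled by shlex instead of hand-escaped quotes. The function name and the sample values in the usage note are made up for illustration; the input is assumed to be the "name zone" lines written to ~/tmp/instancelist above.

```python
import shlex


def build_ssh_commands(instance_lines, project, command):
    """Turn 'name zone' lines into gcloud compute ssh command strings."""
    commands = []
    for line in instance_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, zone = parts
        commands.append(" ".join([
            "gcloud", "compute", "ssh",
            "--zone", shlex.quote(zone),
            shlex.quote(name),
            "--project", shlex.quote(project),
            "--command=" + shlex.quote(command),
        ]))
    return commands
```

For example, build_ssh_commands(open("instancelist"), "my-project", "uptime") returns one command string per instance, ready to be written to out.sh and reviewed before running.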
Startup script: basic web server
/basic-web-server-startup.sh
Networking
packet loss query:
fetch gce_zone_network_health :: networking.googleapis.com/cloud_netslo/active_probing/probe_count |
map add[pd_local_zone: resource.zone, pd_remote_zone: metric.remote_zone, pd_local_region: resource.region, pd_remote_region: metric.remote_region] |
filter (pd_local_region =~ 'us-central1') |
filter (pd_remote_region =~ 'us-central1') |
{
  filter metric.result = 'failure' |
  group_by [resource.zone, metric.remote_zone], 14400s, .sum ; group_by [resource.zone, metric.remote_zone], 14400s, .sum |
  filter val() >= 1 '1'
} |
ratio |
top 50, .mean
from "Network Intelligence" -> "Performance Dashboard" -> "Packet Loss"
How Tos
- single node NFS
- https://medium.com/google-cloud/gke-with-google-cloud-single-node-filer-nfs-4c4dc569964f
Annoyances
1. 'gcloud compute ssh' requires that you know what zone the instance is in.
2. Changing a compute instance's name is destructive. Change the name? Destroy and recreate :(
3. Web gui search bar "redis" doesn't return "memorystore" ( gcp branded redis ) - has been FIXED as of 2021
4. Web gui search bar "memorystore" yields no results. - has been FIXED as of 2021
5. Web gui search bar quite slow. - has been FIXED as of 2021
6. Web gui Load Balancers: the default view is not good for professionals. You must always go to "advanced" (small text at the bottom of the list). There is no way to configure it to always go to advanced.
7. Web GUI load balancer, hard to "see" internal load balancers.
8. Instances log by their ID. You cannot search by instance name in logs. To map a name to an ID, search for this:
resource.type="gce_instance"
protoPayload.request.name="INSTANCE_NAME"
protoPayload.methodName="v1.compute.instances.insert"
Then look in the protoPayload.response.id field. Then change your search to:
resource.type="gce_instance"
protoPayload.response.id="INSTANCEID"
sigh.
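To save retyping, the two searches from item 8 can be generated by a small helper. This is a sketch: the function names are mine, and the field names are copied from the filters above rather than checked against every log type.

```python
def insert_event_filter(instance_name):
    """Logging filter that finds the insert event for a named instance."""
    return "\n".join([
        'resource.type="gce_instance"',
        f'protoPayload.request.name="{instance_name}"',
        'protoPayload.methodName="v1.compute.instances.insert"',
    ])


def instance_id_filter(instance_id):
    """Follow-up filter once the ID is known from protoPayload.response.id."""
    return "\n".join([
        'resource.type="gce_instance"',
        f'protoPayload.response.id="{instance_id}"',
    ])


if __name__ == "__main__":
    # "web-1" is a placeholder instance name.
    print(insert_event_filter("web-1"))
```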
Gotchas
scripting
You are building a GetClusterRequest object in your Python ... but which version of the API docs are you reading?
https://cloud.google.com/python/docs/reference/dataproc/latest/google.cloud.dataproc_v1.types.GetClusterRequest
Don't do this:
request = container_v1.GetClusterRequest(
    project_id=myproject,
    region=myregion,
    cluster_id=mycluster
)
Do this:
request = container_v1.GetClusterRequest(
    name=f"projects/{myproject}/locations/{myregion}/clusters/{mycluster}"
)
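A small helper keeps the resource-name format in one place. This is a sketch: the helper name is hypothetical, and the sample values are placeholders.

```python
def cluster_resource_name(project, location, cluster):
    """Build the 'name' string passed to GetClusterRequest in these notes."""
    for label, value in (("project", project), ("location", location), ("cluster", cluster)):
        if not value or "/" in value:
            raise ValueError(f"bad {label}: {value!r}")
    return f"projects/{project}/locations/{location}/clusters/{cluster}"


# Usage, assuming google-cloud-container is installed:
#   from google.cloud import container_v1
#   request = container_v1.GetClusterRequest(
#       name=cluster_resource_name("my-project", "us-central1", "my-cluster"))
```

Rejecting empty parts and embedded slashes catches the most common slip, which is passing a full path where a bare ID is expected.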
terraform internal lb example
https://cloud.google.com/load-balancing/docs/l7-internal/int-https-lb-tf-examples
Logging
Queries of note
-resource.type="k8s_cluster"
-resource.type="http_load_balancer"
-resource.type="vpn_gateway"
-resource.type="gce_instance"
-resource.type="cloudsql_database"
-resource.type="pubsub_topic"
-resource.type="pubsub_subscription"
-resource.type="stackdriver_notification_channel"
-resource.type="audited_resource"
resource.type="gce_instance_group"
Are you over quota somewhere?
There is a quota dashboard, but what about the log:
protoPayload.status.message="QUOTA_EXCEEDED"
Load Balancer Request logging
Which of your backend services have logging turned on, and what are their sample rates?
gcloud compute backend-services list --format json | \
  jq '.[] | { name: .name , enable: .logConfig.enable , samplerate: .logConfig.sampleRate}| join(",")'
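If jq is not at hand, the same three fields can be pulled out with Python. This is a sketch: the function name is mine, and it assumes the JSON is a list of objects carrying the same name and logConfig fields that the jq expression reads.

```python
import json


def log_config_rows(backend_services_json):
    """Return (name, enable, sampleRate) for each backend service."""
    rows = []
    for svc in json.loads(backend_services_json):
        # Services without logging configured may have no logConfig at all.
        log_config = svc.get("logConfig") or {}
        rows.append((svc.get("name"), log_config.get("enable"), log_config.get("sampleRate")))
    return rows


if __name__ == "__main__":
    import sys
    # Usage: gcloud compute backend-services list --format json | python3 this_script.py
    for name, enable, rate in log_config_rows(sys.stdin.read() or "[]"):
        print(f"{name},{enable},{rate}")
```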
Getting logs with Python
Sometimes you just can't get what you want from the UI.
So do it yourself:
#!/usr/bin/env python3
# Use:
#   'gcloud logging logs list'
# to list available logger names.
import json
from google.cloud import logging
import pprint

logger_name = "compute.googleapis.com%2Fhealthchecks"
logging_client = logging.Client()
logger = logging_client.logger(logger_name)
for entry in logger.list_entries():
    timestamp = entry.timestamp.isoformat()
    # Uncomment this to figure out the fields you want.
    # pprint.pprint(entry)
    print("{} {}".format(
        entry.payload['healthCheckProbeResult']['probeSourceIp'],
        entry.payload['healthCheckProbeResult']['healthState']))
In this case I could not get log metrics.
Reference: https://googleapis.dev/python/logging/latest/entries.html
Log Metrics
Load balancer requests
In this example we are asking:
Which grafana dashboards are being used the most?
First set up a logging metric for the l7 load balancer, then:
fetch l7_lb_rule
| metric 'logging.googleapis.com/user/lblogmetrics'
| filter
    (resource.forwarding_rule_name
       == 'k8s-fws-cluster-monitoring-grafana--XXX')
    &&
    (metric.requesturl
       =~ 'https://grafana.domain.com/d/.*')
| align rate(1m)
| every 1m
| group_by [re_extract(metric.requesturl, "^https://grafana.domain.com/d/(.*)\\?", r'\1')],
    [value_lblogmetrics_aggregate: aggregate(value.lblogmetrics)]
Here I've relabeled "re_extract..." as "url":
fetch l7_lb_rule
| metric 'logging.googleapis.com/user/lblogmetrics'
| filter
    (resource.forwarding_rule_name
       == 'k8s-fws-cluster-monitoring-grafana--XXX')
    &&
    (metric.requesturl
       =~ 'https://grafana.domain.com/d/.*')
| align rate(1m)
| every 1m
| group_by [ url: re_extract(metric.requesturl, "^https://grafana.domain.com/d/(.*)\\?", r'\1')],
    [value_lblogmetrics_aggregate: aggregate(value.lblogmetrics)]
Note the re_extract function in the group_by: it groups on just the part of the request URL you care about. Some request URLs get quite long, so they don't fit well in graphs.
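To check what the re_extract pattern keeps before putting it in a query, the same regular expression can be tried offline. This is a sketch with Python's re module standing in for MQL's re_extract; the sample URLs in the usage note are made up.

```python
import re
from collections import Counter

# Same pattern as in the group_by above, written as a Python raw string.
DASHBOARD_RE = re.compile(r"^https://grafana.domain.com/d/(.*)\?")


def dashboard_path(request_url):
    """Return the part between /d/ and the query string, or None if no match."""
    match = DASHBOARD_RE.match(request_url)
    return match.group(1) if match else None


def count_dashboards(request_urls):
    """Count requests per extracted dashboard path, ignoring non-matches."""
    paths = (dashboard_path(url) for url in request_urls)
    return Counter(path for path in paths if path is not None)
```

A URL such as https://grafana.domain.com/d/abc123/my-dash?orgId=1 yields abc123/my-dash. A URL with no "?" does not match this pattern at all.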
You can do this sort of slice and dice for any site in GCP.
Be careful with logging. For very busy sites, Google logging might be expensive. You can also log a sampled ratio of requests to get a feel for activity without logging _every_ request.
Monitoring and Alerting
Overquota events
https://cloud.google.com/monitoring/alerts/using-quota-metrics
Find orphaned instances
i.e. instances not owned by a MIG (managed instance group) are orphans
#!/bin/sh
# A. list instance groups
# B. list all instances in each instance group
# C. list all instances
# D. subtract one from the other.

echo A. gcloud compute instance-groups list
export INSTANCE_GROUPS=$(mktemp)
echo $INSTANCE_GROUPS
gcloud compute instance-groups list | grep region | grep -v NAME | awk '{print $1}' > ${INSTANCE_GROUPS}

echo B. gcloud compute instance-groups managed list-instances
export NOT_ORPHANS=$(mktemp)
for i in `cat ${INSTANCE_GROUPS}`
do
  gcloud compute instance-groups managed list-instances $i --region us-central1 | grep -v NAME >> ${NOT_ORPHANS}
done

echo C. list all instances
ALL_INSTANCES=$(mktemp)
# NR>1 skips the header row so "NAME" is not reported as an orphan.
gcloud compute instances list | awk 'NR>1 {print $1}' > ${ALL_INSTANCES}
for i in `cat ${ALL_INSTANCES}`
do
  # -w matches whole words, so "web-1" does not match "web-10".
  grep -qw -- "$i" ${NOT_ORPHANS} || echo orphan $i
done
GKE instances might show up in this.
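Step D is a set difference, sketched here in Python on plain lists of names. The function name is mine, and the two input lists are assumed to come from steps B and C of the script.

```python
def find_orphans(all_instances, managed_instances):
    """Return the sorted instance names that no instance group owns."""
    managed = set(managed_instances)
    # Exact name comparison, so "web-1" is not confused with "web-10".
    return sorted(name for name in set(all_instances) if name not in managed)
```

For example, find_orphans(["a", "b", "c"], ["b"]) returns ["a", "c"].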
Survey
VMs
Redis - "memorystore"
SQL
Also see
/gcp cloudevents
Reading
- Hashes and ETags: Best Practices
- https://cloud.google.com/storage/docs/hashes-etags