##GD
Small number of model updates (one per epoch)
Accurate (each step uses the full-data gradient)
Each epoch may be expensive
Easy to parallelize
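
A minimal sketch of why GD parallelizes easily: the full gradient is a sum over examples, so it can be split into per-shard partial sums that are added together before the single update of the epoch. The toy line-fitting problem, the function names `partial_grad` and `gradient_descent`, and the values of `lr`, `epochs` and `n_shards` are all assumptions made for illustration. The shards are processed sequentially here; no real worker pool is used.

```python
def partial_grad(w, b, shard):
    """Unnormalized squared-error gradient summed over one shard of (x, y) pairs."""
    gw = gb = 0.0
    for x, y in shard:
        err = w * x + b - y
        gw += err * x
        gb += err
    return gw, gb


def gradient_descent(data, lr=0.5, epochs=200, n_shards=4):
    """Full-batch GD for y = w*x + b; returns (w, b, number of model updates)."""
    w = b = 0.0
    shards = [data[i::n_shards] for i in range(n_shards)]
    updates = 0
    for _ in range(epochs):
        # Each partial sum is independent of the others, so these calls
        # could run on separate workers and be added up afterwards.
        parts = [partial_grad(w, b, s) for s in shards]
        gw = sum(p[0] for p in parts) / len(data)
        gb = sum(p[1] for p in parts) / len(data)
        w -= lr * gw
        b -= lr * gb
        updates += 1  # exactly one model update per epoch
    return w, b, updates


# Hypothetical noiseless toy data on the line y = 2x + 1.
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(-10, 11)]
w, b, updates = gradient_descent(data)
```

Every update reads all of `data`, which is where the per-epoch cost comes from; the update count equals the number of epochs regardless of dataset size.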
##SGD
Requires lots of model updates
Not as accurate, but often good enough
A lot of progress in one pass for big data
Not trivial to parallelize
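
A rough sketch of the SGD side, assuming the same kind of toy line-fitting problem: the model is updated after every single example, so one pass over the data already yields as many updates as there are examples. The names `sgd` and `mse`, the step size `lr`, the seed and the synthetic dataset are hypothetical choices for illustration. Each update reads the weights written by the previous one, and that sequential dependency is what makes parallelizing it non-trivial.

```python
import random


def mse(w, b, data):
    """Mean squared error of the line y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def sgd(data, lr=0.1, epochs=1, seed=0):
    """Plain SGD for y = w*x + b; returns (w, b, number of model updates)."""
    rng = random.Random(seed)
    w = b = 0.0
    updates = 0
    for _ in range(epochs):
        order = data[:]       # copy so the caller's list is not reordered
        rng.shuffle(order)    # visit the examples in random order
        for x, y in order:
            err = w * x + b - y   # error on this one example only
            w -= lr * err * x     # noisy gradient step, uses the latest w and b
            b -= lr * err
            updates += 1          # one model update per example
    return w, b, updates


# Hypothetical noiseless toy data on the line y = 2x + 1, 2000 examples.
data = [(i / 1000.0, 2.0 * (i / 1000.0) + 1.0) for i in range(-1000, 1000)]
loss_before = mse(0.0, 0.0, data)
w, b, updates = sgd(data, epochs=1)
loss_after = mse(w, b, data)
```

Comparing `loss_before` with `loss_after` shows how far a single pass gets on this toy problem. With noisy real data the final iterate typically keeps fluctuating around the optimum, which matches the "not as accurate, but often good enough" point above.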