Accurate classification of transients from spectroscopic data is important for understanding their nature and for discovering new classes of astronomical objects. For supernovae (SNe), SNID, NGSF (a Python version of SUPERFIT), and DASH are widely used in the community. Each tool provides its own metric to help determine the classification: rlap for SNID, chi2/dof for NGSF, and Probability for DASH. However, it is not known how accurate these tools are, and they have not been tested on a large homogeneous data set. In this work, we therefore study the accuracy of these spectral classification tools using 4646 SEDMachine spectra, which have accurate classifications obtained from the Zwicky Transient Facility Bright Transient Survey (BTS). Comparing our classifications with those from BTS, we test the classification accuracy in several ways. We find that NGSF has the best performance (overall Accuracy of 87.6% when the samples are split into SN Ia and non-Ia types), while SNID and DASH perform similarly, with overall Accuracy of 79.3% and 76.2%, respectively. For SNe Ia specifically, SNID classifies them accurately when rlap > 15, with almost no contamination from other types such as Ibc, II, SLSN, and objects that are not SNe (Purity > 98%). For other types, the classification is often uncertain. We conclude that it is difficult to obtain an accurate classification from these tools alone, so additional human visual inspection is required to confirm the classification. To reduce this inspection effort and to support the classification process for future large-scale surveys, this work provides supporting information, such as the accuracy of each tool as a function of its metric.
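
As an illustration of what "accuracy as a function of a tool's metric" can mean in practice, the following is a minimal sketch, not the pipeline used in this work. It assumes the standard definitions Accuracy = (TP + TN) / N and Purity = TP / (TP + FP) for the binary SN Ia versus non-Ia split, which may differ in detail from the definitions adopted in the paper. The function name `ia_accuracy_and_purity`, the helper `is_ia`, and the convention that every SN Ia label begins with "Ia" are hypothetical choices made for this example.

```python
from typing import Optional, Sequence, Tuple


def is_ia(label: str) -> bool:
    # Assumption: SN Ia labels (including subtypes) start with "Ia".
    return label.startswith("Ia")


def ia_accuracy_and_purity(
    true_types: Sequence[str],
    predicted_types: Sequence[str],
    metric: Sequence[float],
    threshold: float,
) -> Tuple[Optional[float], Optional[float]]:
    """Binary Ia / non-Ia accuracy and Ia purity for spectra whose
    tool metric (e.g. rlap) exceeds `threshold`.

    Returns (None, None) if no spectrum passes the cut, and a purity
    of None if no selected spectrum is classified as Ia.
    """
    # Keep only spectra above the metric cut, reduced to Ia / non-Ia flags.
    selected = [
        (is_ia(t), is_ia(p))
        for t, p, m in zip(true_types, predicted_types, metric)
        if m > threshold
    ]
    if not selected:
        return None, None

    tp = sum(1 for t, p in selected if t and p)          # true Ia, called Ia
    tn = sum(1 for t, p in selected if not t and not p)  # non-Ia, called non-Ia
    fp = sum(1 for t, p in selected if not t and p)      # non-Ia, called Ia

    accuracy = (tp + tn) / len(selected)
    purity = tp / (tp + fp) if (tp + fp) > 0 else None
    return accuracy, purity
```

Evaluating this function over a grid of thresholds gives the two quantities as functions of the metric. For a metric where smaller values indicate a better match, such as chi2/dof, the comparison `m > threshold` would need to be reversed.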