This study aims to develop a brain-computer interface (BCI) that uses EEG signals recorded during speech perception as training data. To characterize both a previously studied vowel and a consonant not yet examined in this context, we compared EEG signals recorded during perception and imagery of the vowel /a/ and the consonant-vowel syllable /ka/, using event-related potentials (ERPs) and event-related spectral perturbation (ERSP). N200 and P300 components were observed only during perception. In contrast, no significant differences between the perception and imagery tasks were found in the 4-8 Hz band for /a/ or in the 16-20 Hz band for /ka/. The right frontal beta-band event-related desynchronization (ERD) observed during imagery of /a/, however, was absent for /ka/. Future work will focus on constructing decoding models that exploit the components shared across both tasks together with syllable-specific power fluctuations.
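A minimal sketch (not the authors' analysis code) of how an ERP and a baseline-normalized ERSP could be computed from pre-epoched EEG with MNE-Python; the file name, the 4-8 Hz band, and the baseline window are illustrative assumptions.

```python
import numpy as np
import mne
from mne.time_frequency import tfr_morlet

# Hypothetical epochs file for the /a/ perception condition.
epochs = mne.read_epochs("a_perception-epo.fif")

# ERP: average across trials; N200/P300 would appear in this waveform.
erp = epochs.average()

# ERSP: Morlet-wavelet power in an assumed 4-8 Hz (theta) band,
# expressed as percent change from the pre-stimulus baseline.
freqs = np.arange(4.0, 8.5, 0.5)
power = tfr_morlet(epochs, freqs=freqs, n_cycles=freqs / 2.0,
                   return_itc=False, average=True)
power.apply_baseline(baseline=(None, 0.0), mode="percent")
```

The same procedure, repeated per condition (/a/ vs. /ka/, perception vs. imagery), would yield the task-wise ERPs and band-power estimates that the comparison above refers to.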